Abstract: Multiple-input double-output (MIDO) codes
are important in near-future wireless communications,
where the portable end-user device is physically small and 
will typically contain at most two receive antennas. Espe- 
cially tempting is the 4x2 channel due to its immediate 
applicability in the digital video broadcasting (DVB). Such 
channels optimally employ rate-two space-time (ST) codes 
consisting of (4 x 4) matrices. Unfortunately, such codes 
are in general very complex to decode, hence setting forth 
a call for constructions with reduced complexity. 

Recently, some reduced complexity constructions have 
been proposed, but they have mainly been based on 
different ad hoc methods and have resulted in isolated 
examples rather than in a more general class of codes. 
In this paper, it will be shown that a family of division 
algebra based MIDO codes will always result in at least 
37.5% worst-case complexity reduction, while maintaining 
full diversity and, for the first time, the non-vanishing 
determinant (NVD) property. The reduction follows from 
the fact that, similarly to the Alamouti code, the codes will 
be subsets of matrix rings of the Hamiltonian quaternions, 
hence allowing simplified decoding. At the moment, such 
reductions are among the best known for rate-two MIDO 
codes [4], [5]. Several explicit constructions are presented 
and shown to have excellent performance through com- 
puter simulations. 

Index Terms: Coding gain, cyclic division algebra, digital
video broadcasting next generation handheld (DVB-NGH),
fast maximum-likelihood (ML) sphere decoding,
Hamiltonian quaternions, Hasse invariants, lattices, low- 
complexity space-time block codes (STBCs), multiple-input 
single/double/multiple-output (MISO/MIDO/MIMO), non- 
vanishing determinant (NVD), orders. 

I. Introduction 

Among known space-time codes, the Alamouti code 
[6] and the fully diverse 4x1 quasi-orthogonal codes 
[7] stand out due to their orthogonality properties that 
are beneficial for decoding. Both of these codes, however,
have a low code rate and are hence best suited to asymmetric
transmission, where there are fewer receive antennas
than transmit antennas. It is far from obvious how to
generalize these codes to asymmetric scenarios where
we demand higher code rates and a different number of
antennas. On the other hand, the now well-known cyclic
division algebra (CDA) codes designed for symmetric
transmission have full rate and are generalizable to an
arbitrary number of antennas. Unfortunately, they are
very complex to decode, especially when there are fewer
receive antennas than transmit antennas. Yet there is a
strong demand for asymmetric codes that are fast-decodable,
generalizable to more antennas, and able to
support higher rates. A code for the special case of two
receive antennas is referred to as a multiple-input
double-output (MIDO) code.

Part of this work appeared at ISIT 2010 [1], at SPCOM 2010 [2],
and at ISITA 2010 [3].

For example, one of the most interesting wireless applications
currently is the design of 4x2 MIDO codes. Such
asymmetric systems can be used in the communication
between, for instance, a TV broadcasting station and a
portable digital TV device. The four transmitters can
either all be at one station or be split between two
different stations, thereby providing better coverage
when the transmission of one of the stations
is blocked by a deep shadow.

In Europe, the digital video broadcasting (DVB) consortium
has adopted different standards for terrestrial
(DVB-T) fixed reception, handheld (DVB-H) reception,
satellite (DVB-S) reception, as well as hybrid
reception such as DVB-SH. The ongoing work towards the
standardization of the DVB Next Generation Handheld
(NGH, see the DVB Project's web page [8] for more
information) systems is bringing this topic ever more to
the forefront of current MIMO research. The inclusion of
the 4x2 systems in the consortium's call for technologies
for the DVB-NGH points to a MIDO code being included in the
coming standard.

One solution to the 4x2 code construction problem 
could be to use a full-rate CDA code, e.g. the 4x4 
Perfect code [9]. However, when received with two 
antennas, a rate-four code cannot be optimally decoded 
with a linear decoder such as a sphere decoder. Codes 
especially designed for the 4x2 channel have been 
proposed in e.g. [10], [11], [12], but all the codes require 
high complexity maximum-likelihood (ML) decoding, 
namely full-dimensional sphere decoding. 

A natural approach to this design problem is to imitate 
the form of the code matrices of the already known fast-decodable
codes or use these codes as building blocks for
higher rate codes. The key problem in such constructions 
is that it is very hard to guarantee that the resulting 
code will still have good performance, thus in many 
cases requiring optimization to be carried out through 
extensive computer searches. 

In this paper we are going to adopt a different ap- 
proach to this problem. We study the algebraic structure 
of known fast-decodable codes like the Alamouti code 
and the division algebra based quasi-orthogonal codes. 
By analyzing the relation between the Hasse-invariants 
and the geometric structure of these codes we are able 
to distill the key algebraic properties that force these 
codes to be fast-decodable. This approach then yields
an infinite family of fast-decodable codes from division
algebras.

The main advantage of our take on this subject is 
that the proposed codes are based on orders of division 
algebras and therefore they are not only fast-decodable, 
but are also guaranteed to have full-diversity, the non- 
vanishing determinant (NVD) property, and further allow 
us to perform algebraic minimum determinant optimiza- 
tion. We can show, under given conditions, that the ML 
decoding complexity of a MIDO code will always be 
reduced by at least 37.5%, while maintaining the NVD. 
Explicit constructions based on the proposed criteria will 
be provided. One of the examples introduces a code that 
has comparable performance with the best known fast- 
decodable ST codes [4], [5] and further has (provable) 
NVD. The proposed theory provides fully diverse, fast-decodable
(FD) codes with the NVD property for any
even number $n_t$ of Tx antennas and any code rate
$\le n_t/2$. Motivated by the DVB-NGH, most of the
examples are given in the case of 4 Tx antennas and
2 Rx antennas.

We make the typical assumption of transmission over 
a coherent i.i.d. Rayleigh fading channel with perfect 
channel state information at the receiver (CSIR) and with 
no CSIT, 

Y = HX + N, 

where $Y$, $X$, $H$, $N$ are the received, transmitted, channel,
and Gaussian noise matrices, respectively. The ST
matrix $X \in M_{n_t}(\mathbb{C})$, while $Y, H, N \in M_{n_r \times n_t}(\mathbb{C})$,
where $n_t$ (resp. $n_r$) denotes the number of transmit (resp.
receive) antennas. We assume no correlation, but in the
correlated case the transmitter can adapt to the rate- 
one code naturally embedded within the proposed codes 
while maintaining and even improving fast decodability. 



A. Related work 

The first reduced ML-complexity 4x2 construction 
was given in [4], combining two copies of a quasi- 
orthogonal code [13]. This resulted in a MIDO code that 
does have lower decoding complexity, but unfortunately 
does not have full rank. Nevertheless, good performance 
is still achieved at low-to-moderate SNRs and with four 
real dimensions less in the sphere decoder. 

The most recent results on fast-decodable codes have 
appeared in [5], where new constructions with optimized 
performance have been presented, and in [1], [2], [3], 
where fast-decodable codes with the NVD property have 
been built from crossed product and cyclic presentations 
of division algebras. In the preprint [14] the authors 
consider quadratic forms as a tool for characterizing 
the decoding complexity, and in the preprint [15] multi- 
group ML-decodable collocated and distributed space- 
time codes are proposed. 

B. Organization and contributions 

The rest of the paper is organized as follows. We 
start by giving some background on space-time codes 
with a lattice structure and their decoding via sphere 
decoding in Section II. The concept of fast decodability 
is then defined and illustrated in Section III, where the 
role of the Alamouti code is emphasized. To pursue the 
study of fast-decodable codes, we then focus on CDA 
codes in Section IV, where some background and further 
motivating examples are presented, translating fast de- 
codability into being able to embed the considered cyclic 
algebra into an algebra of matrices with quaternionic 
coefficients. The conditions guaranteeing the existence 
of such an embedding are studied in Section V: we 
need an algebra whose center is totally real and such 
that all its infinite places ramify in the algebra. A 
family of such cyclic algebras is provided. A last design 
criterion, the normalized minimum determinant, is added 
and bounds on optimal lattice codes with respect to it are 
computed in Section VI. Different explicit construction 
methods are described in Section VII. Finally, several 
code constructions are presented in Section VIII for 4 x 2 
codes followed by simulation results in Section IX. In 
Section X the results are extended for more transmit 
antennas and explicit constructions are provided for 6 x 3 
and 6x2 codes. 

Further generalizations are provided in Section XI, 
where it is also shown that the existence result can 
be made explicit via conjugations of the familiar left- 
regular representation. Section XII concludes the paper. 
In the Appendix, relevant algebraic results related to central
simple algebras and Hasse invariants are presented.

The main contributions of this paper are listed below.

• General methods to produce space-time lattice 
codes with the NVD property and given geometric 
structure are given. 

• A unified construction of families of CDAs that can 
be embedded into matrix rings of the Hamiltonian 
quaternions $M_k(\mathbb{H})$ is provided. The underlying
algebraic principles are studied in full detail. It 
is then demonstrated how such a structure can be 
beneficial in the decoding. The generality of the 
constructions is in contrast to the present ad hoc 
constructions available in the literature. 

• A complete solution to the discriminant minimiza- 
tion problem [16] for division algebras with arbi- 
trary centers is given. As an application a normal- 
ized minimum determinant bound for code lattices 
in $M_k(\mathbb{H})$ is derived from the algebraic results.

• We mainly consider the 4x2 MIDO case, but also 
provide constructions for the 6 x 2 and 6x3 cases. 
The methods are generalizable to any even number 
of Tx antennas. 

• The main difference with other fast-decodable 
MIDO codes is that all the proposed codes have 
the NVD property. The proofs for the NVD are 
based on the underlying algebraic structure of the 
code and hold for infinite constellations. This can 
be seen as an improvement over [5], where the
NVD is conjectured by computing the minimum 
determinant for certain finite QAM alphabets. 

• We build explicit codes that have 25-37.5% reduced 
decoding complexity for general constellations, and 
whose performance is comparable to the best known 
MIDO codes. Such complexity is among the best 
known for the MIDO channel, and can be further 
reduced by using a symmetric alphabet - a square 
QAM alphabet, for instance. No fast-decodable 
MIDO codes with provable NVD other than the 
ones in this paper have been reported. 

C. Notations 

Throughout the paper, we will use the following 
notations: 

• Tx for transmit antennas, Rx for receive antennas,

• $n_t \times n_r$ for a channel with $n_t$ Tx and $n_r$ Rx antennas,

• $(n \times k)$ for matrix dimensions,

• boldface lowercase letters for vectors, e.g. $\mathbf{g} =
(g_1, \ldots, g_t)$ or $\mathbf{g} = (g_1, \ldots, g_t)^T$,

• capital letters for matrices, e.g. $X$ or $M$,

• $x^*$ for the complex conjugate of $x$, $X^*$ for element-wise
conjugation in a matrix $X$, and $X^\dagger$ for the
Hermitian conjugate of $X$,

• calligraphic letters for algebras, e.g. $\mathcal{A}$,

• $E/K$ for number field extensions and $\sigma$ for the
generator of a cyclic Galois group $\mathrm{Gal}(E/K)$. Note
that $K$ is also used for the rank of a lattice in
some instances, but this should cause no danger of
confusion.

• The field norm from $E$ to $K$ is denoted by

$$N_{E/K}(x) = x\,\sigma(x) \cdots \sigma^{n-1}(x) \in K,$$

where $n = \#\mathrm{Gal}(E/K)$.

II. Space-time lattice codes 

We start with as general a definition of a space-time
code as possible, and motivate why we focus our
attention on space-time lattice codes, which furthermore
can be decoded via a sphere decoder, a universal decoder
for lattice codes. We explain in detail how this is done.

A. Definitions 

Abstractly, a space-time codeword $X$ is an $(n \times k)$
matrix with coefficients in $\mathbb{C}$, where $n$ corresponds to
the number of transmit antennas, and $k$ is the coherence
time (or delay) during which the channel is assumed
constant. We will, in this paper, concentrate on the case
$k = n$, so that a space-time codeword is a square matrix,
corresponding to minimum delay codes.

Definition 2.1: A space-time code $\mathcal{C}$ is a set of $(n \times n)$
complex matrices. We often use the abbreviation STBC
for space-time block code.

The space $M_n(\mathbb{C})$ of $(n \times n)$ matrices with complex
coefficients is a vector space of dimension

$$\dim_{\mathbb{R}}(M_n(\mathbb{C})) = 2n^2$$

over the reals. Therefore, for every code $\mathcal{C} \subset M_n(\mathbb{C})$, we
can consider, following [15], the subspace $\langle \mathcal{C} \rangle$ spanned
by the matrices of $\mathcal{C}$. It has an $\mathbb{R}$-basis consisting of $K$
matrices, $1 \le K \le 2n^2$, so that each matrix $X$ in $\mathcal{C}$ can
be uniquely written as

$$X = \sum_{i=1}^{K} g_i B_i, \tag{1}$$

where the $B_i$ are some basis matrices and the $g_i$ are real
numbers. Once the basis matrices $\{B_1, \ldots, B_K\}$ are given,
a space-time code $\mathcal{C}$ is defined by the values that $g_i$,
$i = 1, \ldots, K$, can take. We write

$$\mathbf{g} = (g_1, \ldots, g_K)$$

and let $\mathbf{g}$ take its values in $\mathcal{G} \subset \mathbb{R}^K$, so that

$$\mathcal{C} = \Big\{ \sum_{i=1}^{K} g_i B_i \;\Big|\; \mathbf{g} = (g_1, \ldots, g_K) \in \mathcal{G} \Big\}. \tag{2}$$

Typically, $\mathcal{G}$ corresponds to a choice of constellation
points. For example, if a size-$Q$ pulse amplitude modulation
($Q$-PAM) is used, then $\mathcal{G}$ is the $K$-fold Cartesian product
of

$$\{-Q+1, \ldots, -3, -1, 1, 3, \ldots, Q-1\},$$

where $Q \ge 2$, $2 \mid Q$. The formulation in (2) is reminiscent
of the notion of linear dispersion codes [17],
where codewords $X$ are similarly described by a family
of dispersion matrices $\{A_1, \ldots, A_K\}$: $X = \sum_{i=1}^{K} g_i A_i$,
for some coefficients $g_i$ belonging to a symmetric set.
The critical difference is in $\{B_1, \ldots, B_K\}$ being linearly
independent, and thus really forming an $\mathbb{R}$-basis for $\langle \mathcal{C} \rangle$.
It consequently makes sense to speak of the dimension of
$\langle \mathcal{C} \rangle$, which yields the following definition of rate [15]:

Definition 2.2: The dimension rate $R_1$ of the code $\mathcal{C}$
is given by

$$R_1 = \frac{\dim_{\mathbb{R}}(\langle \mathcal{C} \rangle)}{n} = \frac{K}{n}$$

(real) dimensions per channel use.

Since $1 \le K \le 2n^2$, we immediately see that the
maximum rate achievable for square matrices is $2n$.
One should note that this is not the common definition
of a code rate (also used in this paper until now),
which usually counts how many complex symbols (e.g.
QAM symbols) are transmitted in a codeword. With our
notation, the common code rate would be $R_1/2 \le n$.

The data rate in bits per channel use (bpcu) is defined 
as follows. 

Definition 2.3: The bit rate $R_2$ of the code $\mathcal{C}$ is

$$R_2 = \frac{\log_2(|\mathcal{C}|)}{n}$$

bpcu.
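As a small numerical illustration of Definitions 2.2 and 2.3, the following sketch computes both rates for a Q-PAM constellation. The function names and the parameter choice (n = 4, K = 16, Q = 4) are illustrative assumptions, and the count |C| = Q^K assumes every coefficient ranges independently over the same Q-PAM set.

```python
import math

def pam(Q):
    """Q-PAM alphabet {-Q+1, ..., -3, -1, 1, 3, ..., Q-1} for even Q >= 2."""
    if Q < 2 or Q % 2:
        raise ValueError("Q must be an even integer >= 2")
    return list(range(-Q + 1, Q, 2))

def dimension_rate(K, n):
    """R_1 = K / n real dimensions per channel use."""
    return K / n

def bit_rate(K, n, Q):
    """R_2 = log2(|C|) / n, with |C| = Q**K for K independent Q-PAM symbols."""
    return K * math.log2(Q) / n

# Assumed example: a (4 x 4) code carrying K = 16 real symbols from 4-PAM.
n, K, Q = 4, 16, 4
print(pam(Q))                # [-3, -1, 1, 3]
print(dimension_rate(K, n))  # 4.0
print(bit_rate(K, n, Q))     # 8.0
```

With these numbers the common (complex) code rate is R_1/2 = 2, the rate-two case discussed in the introduction.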

While the above considerations have been done in 
full generality, several years of research on space-time 
coding have shown that good space-time codes enjoy 
special properties. Following [18], getting fully diverse 
codes has become the first code design criterion. That 
is, we require 

$$\det(X - X') \neq 0, \quad X \neq X' \in \mathcal{C}. \tag{3}$$

From [19] it is known that the best way to actually deal
with this constraint is to first assume that the space-time
code considered forms an additive group, so that

$$X \pm X' \in \mathcal{C}, \tag{4}$$

which simplifies (3) to

$$\det(X) \neq 0, \quad X \neq 0,$$

a much more tractable constraint. We note that $\mathcal{C}$ as
defined in (2) is not necessarily linear, but of course $\langle \mathcal{C} \rangle$
is. From the linearity imposed on $\mathcal{C}$ by (4), we are only
one step away from having a space-time lattice code.
Recall that

Proposition 2.1: An infinite discrete group of matrices
in $M_n(\mathbb{C})$ is a lattice.

We can thus safely assume that infinite space-time 
codes have a lattice structure, since the discreteness 
condition can be translated by asking the Euclidean 
distance between each pair of codewords to be greater 
than r, for a fixed non-zero r. This formalizes the natural 
assumption that codewords should not be chosen too 
close to each other. 

Definition 2.4: A space-time lattice code $\mathcal{C} \subset M_n(\mathbb{C})$
has the form

$$\mathbb{Z}B_1 \oplus \mathbb{Z}B_2 \oplus \cdots \oplus \mathbb{Z}B_K,$$

where the matrices $B_1, \ldots, B_K$ are linearly independent,
i.e., form a lattice basis, and $K$ is called the rank of the
lattice. We may also call $K$ the dimension of the code,
but do not confuse this with the dimension of the lattice.

For the actual transmission, a finite subset of codewords
from $\mathcal{C}$ is picked by restricting the integer coefficients
to some set $\mathcal{G}$, as in (2). From now on, we will
consider only space-time lattice codes and may call them 
space-time codes for short. 

As recalled above, full diversity is the first design 
criterion for space-time codes. Once achieved, meaning 
for lattice codes that 

$$\det(X) \neq 0, \quad X \neq 0,$$

the next criterion is to maximize the minimum determinant
of the code.

Definition 2.5: The minimum determinant $\mathrm{det}_{\min}(\mathcal{C})$
of a space-time code $\mathcal{C} \subset M_n(\mathbb{C})$ is defined to be

$$\mathrm{det}_{\min}(\mathcal{C}) = \inf_{X \neq 0} |\det(X)|, \quad X \in \mathcal{C}.$$

Definition 2.6: [20] If the minimum determinant of
the lattice is non-zero, we say that the code has a
non-vanishing determinant (NVD).
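To make the NVD notion concrete, here is a minimal sketch that enumerates Alamouti codewords [6] with Gaussian-integer entries inside growing boxes and records the smallest non-zero |det|. The codeword shape (x1, -x2*; x2, x1*) is the standard Alamouti form; the function names and the box radii are illustrative assumptions. Since det = |x1|^2 + |x2|^2, the minimum should stay at 1 however large the box becomes.

```python
from itertools import product

def alamouti(x1, x2):
    """Alamouti codeword with complex symbols x1, x2."""
    return [[x1, -x2.conjugate()], [x2, x1.conjugate()]]

def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def min_det(radius):
    """Smallest |det| over non-zero codewords with integer coordinates in [-radius, radius]."""
    coords = range(-radius, radius + 1)
    best = None
    for a, b, c, d in product(coords, repeat=4):
        if (a, b, c, d) == (0, 0, 0, 0):
            continue
        value = abs(det2(alamouti(complex(a, b), complex(c, d))))
        best = value if best is None else min(best, value)
    return best

for radius in (1, 2, 3):
    print(radius, min_det(radius))   # the minimum does not shrink as the box grows
```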

The NVD property means that, prior to SNR normal- 
ization, the lower bound on the minimum determinant 
does not depend on the size of the constellation used. 

B. Sphere decoding 

Let $X$ be a space-time lattice codeword. We can flatten
$X \in M_n(\mathbb{C})$ to obtain a $2n^2$-dimensional real vector
$\mathbf{x}$ by first forming a vector of length $n^2$ out of the
entries (e.g. row by row, or by vectorizing, that is, column by
column) and then replacing each complex entry with the
pair formed by its real and imaginary parts. This defines
a mapping $\alpha$ from $M_n(\mathbb{C})$ to $\mathbb{R}^{2n^2}$:

$$\alpha : X \mapsto \mathbf{x} = \alpha(X), \tag{5}$$

which is clearly $\mathbb{R}$-linear:

$$\alpha(rX + r'X') = r\alpha(X) + r'\alpha(X'), \quad r, r' \in \mathbb{R}. \tag{6}$$



Let $\|X\|_F = \sqrt{\mathrm{Tr}(X^\dagger X)}$ denote the Frobenius norm of
$X$. Note that the following equality holds:

$$\|X\|_F^2 = \sum_{i=1}^{n} \sum_{j=1}^{n} |x_{i,j}|^2 = \|\alpha(X)\|_E^2, \tag{7}$$

where $\|\cdot\|_E$ denotes the Euclidean norm of a vector.
This makes $\alpha$ an isometry.
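The flattening map and its two properties, R-linearity (6) and the isometry (7), can be checked numerically. The sketch below uses the row-by-row ordering; the helper names and the random test matrices are assumptions made for illustration only.

```python
import random

def alpha(X):
    """Flatten a complex matrix row by row into (Re, Im) pairs, as in (5)."""
    return [part for row in X for z in row for part in (z.real, z.imag)]

def frobenius_sq(X):
    """Squared Frobenius norm: sum of |x_ij|^2."""
    return sum(abs(z) ** 2 for row in X for z in row)

def euclid_sq(v):
    """Squared Euclidean norm of a real vector."""
    return sum(t * t for t in v)

def random_matrix(rows, cols, rng):
    return [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(cols)]
            for _ in range(rows)]

rng = random.Random(0)
X, Xp = random_matrix(4, 4, rng), random_matrix(4, 4, rng)
r, rp = 0.7, -1.3

# Isometry (7): ||X||_F^2 equals ||alpha(X)||_E^2.
isometry_gap = abs(frobenius_sq(X) - euclid_sq(alpha(X)))

# R-linearity (6): alpha(rX + r'X') equals r*alpha(X) + r'*alpha(X').
combo = [[r * a + rp * b for a, b in zip(ra, rb)] for ra, rb in zip(X, Xp)]
linearity_gap = max(abs(u - (r * a + rp * b))
                    for u, a, b in zip(alpha(combo), alpha(X), alpha(Xp)))

print(len(alpha(X)), isometry_gap < 1e-9, linearity_gap < 1e-9)
```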

The space-time codeword $X \in M_n(\mathbb{C})$ is transmitted over
a coherent Rayleigh fading channel with perfect channel
state information at the receiver (CSIR):

$$Y = HX + V,$$

where $H$ is the channel matrix and $V$ is the Gaussian
noise at the receiver. Maximum-likelihood (ML) decoding
consists of finding the codeword $X$ that achieves the
minimum of the squared Frobenius norm

$$d(X) = \|Y - HX\|_F^2. \tag{8}$$


This search can be performed using a real sphere decoder
(see e.g. [21]). Since this paper focuses on MIDO codes,
and for the sake of simplicity, we will now exemplify
the computation for a $(4 \times 4)$ MIDO code matrix $X$, that
is, we consider 4 Tx antennas and 2 Rx antennas and
the channel

$$Y_{2 \times 4} = H_{2 \times 4} X_{4 \times 4} + V_{2 \times 4}. \tag{9}$$


A $(4 \times 4)$ MIDO code can transmit up to 8 complex (say
QAM) information symbols, or equivalently 16 real (say
PAM) information symbols. Following (2), the encoding
can thus be written as mapping the PAM vector

$$\mathbf{g} = (g_1, \ldots, g_{16})^T$$

into a $(4 \times 4)$ matrix

$$X = \sum_{i=1}^{16} g_i B_i,$$

where the basis matrices $B_i$, $i = 1, \ldots, 16$, define the
code. Let us emphasize again that by basis matrices, we
really mean a $\mathbb{Z}$-basis of the code seen as a lattice. From
(9), the received matrix $Y$ can be expressed as

$$Y_{2 \times 4} = H\Big(\sum_{i=1}^{16} g_i B_i\Big) + V = \sum_{i=1}^{16} g_i (HB_i) + V.$$


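In order to perform real sphere decoding, we have to
transform this complex channel equation into a real one,
which can be done via the mapping $\alpha$ defined in (5). The
matrix $Y_{2 \times 4} = (y_{i,j})$ can be turned into a real-valued
vector $\mathbf{y}$ in $\mathbb{R}^{16}$ by the transformation

$$\alpha(Y) = \mathbf{y} = [\mathbf{y}_1, \mathbf{y}_2]^T$$

with

$$\mathbf{y}_1 = (\Re(y_{1,1}), \Im(y_{1,1}), \ldots, \Re(y_{1,4}), \Im(y_{1,4})),$$

$$\mathbf{y}_2 = (\Re(y_{2,1}), \Im(y_{2,1}), \ldots, \Re(y_{2,4}), \Im(y_{2,4})).$$

The matrices $HB_i \in M_{2 \times 4}(\mathbb{C})$ are then similarly turned
into vectors $\mathbf{b}_i \in \mathbb{R}^{16}$:

$$\alpha(HB_i) = \mathbf{b}_i, \quad i = 1, \ldots, 16,$$

so that $d(X)$ can be expressed as

$$\begin{aligned}
d(X) &= \|Y - HX\|_F^2 && \text{by (8)} \\
&= \|\alpha(Y - HX)\|_E^2 && \text{by (7)} \\
&= \|\alpha(Y) - \alpha(HX)\|_E^2 && \text{by (6)} \\
&= \Big\|\mathbf{y} - \sum_{i=1}^{16} g_i \mathbf{b}_i\Big\|_E^2.
\end{aligned}$$

From this we finally get

$$d(X) = \|\mathbf{y} - B\mathbf{g}\|_E^2, \tag{10}$$

where

$$B = (\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_{16}) \in M_{16 \times 16}(\mathbb{R}).$$

This shows that the decoding of a space-time lattice
code $\mathcal{C}$ with a basis $\{B_1, \ldots, B_{16}\}$ is equivalent to the
decoding of a 16-dimensional real lattice $\Lambda(\mathcal{C})$ described
by the generator matrix $B$: $\Lambda(\mathcal{C}) = \{\mathbf{x} = B\mathbf{g} \mid \mathbf{g} \in \mathbb{Z}^{16}\}$.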
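The passage from the matrix metric (8) to the lattice metric (10) can be sanity-checked numerically. In the sketch below the 16 basis matrices are random placeholders rather than an actual code from this paper, and all helper names are assumptions; the point is only that both expressions give the same value for any basis, channel, and PAM vector.

```python
import random

def alpha(X):
    """Row-by-row flattening of a complex matrix into (Re, Im) pairs, as in (5)."""
    return [part for row in X for z in row for part in (z.real, z.imag)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def random_matrix(rows, cols, rng):
    return [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(cols)]
            for _ in range(rows)]

def codeword(basis, g):
    """X = sum_i g_i B_i."""
    n = len(basis[0])
    return [[sum(gi * Bi[r][c] for gi, Bi in zip(g, basis)) for c in range(n)]
            for r in range(n)]

def metric_matrix_form(Y, H, X):
    """d(X) = ||Y - HX||_F^2, as in (8)."""
    HX = matmul(H, X)
    return sum(abs(y - t) ** 2 for ry, rt in zip(Y, HX) for y, t in zip(ry, rt))

def metric_lattice_form(Y, H, basis, g):
    """d(X) = ||y - Bg||_E^2, as in (10), with columns b_i = alpha(H B_i)."""
    y = alpha(Y)
    columns = [alpha(matmul(H, Bi)) for Bi in basis]
    Bg = [sum(gi * col[r] for gi, col in zip(g, columns)) for r in range(len(y))]
    return sum((a - b) ** 2 for a, b in zip(y, Bg))

rng = random.Random(1)
basis = [random_matrix(4, 4, rng) for _ in range(16)]  # placeholder basis matrices
H = random_matrix(2, 4, rng)                           # 4x2 MIDO channel
Y = random_matrix(2, 4, rng)                           # stand-in received matrix
g = [rng.choice([-3, -1, 1, 3]) for _ in range(16)]    # 4-PAM symbol vector

d_matrix = metric_matrix_form(Y, H, codeword(basis, g))
d_lattice = metric_lattice_form(Y, H, basis, g)
print(abs(d_matrix - d_lattice) <= 1e-9 * max(1.0, d_matrix))
```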

III. Fast-decodable space-time codes 

We are now ready to explain the notion of fast de- 
codability of space-time lattice codes when using sphere 
decoding. We will then give a few examples that will 
motivate the rest of the paper. 

A. Fast sphere decoding 

The first step of the sphere decoder is to perform a QR
decomposition of the lattice generator matrix $B$, $B =
QR$, with $Q^\dagger Q = I$, to reduce the computation of
$d(X)$ as in (10) to

$$d(X) = \|\mathbf{y} - QR\mathbf{g}\|_E^2 = \|Q^\dagger \mathbf{y} - R\mathbf{g}\|_E^2, \tag{11}$$

where $R$ is an upper right triangular matrix. The number
and position of non-zero elements in the upper right
part of $R$ will determine the complexity of the sphere
decoding process [4], [5].
The worst case is of course given when the matrix
R is a full upper right triangular matrix. This motivates 
the following definition of worst case sphere decoding 
complexity: 

Definition 3.1: [4, Def. 2] Let $S$ denote the real alphabet
in use, and let $\kappa$ be the number of independent real
information symbols from $S$ within one code matrix.
The ML decoding complexity is the minimum number
of values of $d(X)$ in (11) that should be computed
while performing ML decoding. This number cannot
exceed $|S|^{\kappa}$, the complexity of the exhaustive-search ML
decoder (or $|S|^{\kappa/2}$ for a complex alphabet $S$).

Definition 3.2: The exponent $\kappa$ (resp. $\kappa/2$) is referred
to as the dimension of a real (resp. complex) sphere
decoder. If the structure of the code is such that $\kappa$
decreases, we say that the code is fast-decodable. In this
paper, we always refer to the dimension of a real sphere
decoder.

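In the MIDO case (9), where $S$ is a real PAM
alphabet (and hence $|S|$ is the number of PAM symbols
in use), the worst case complexity is $|S|^{16}$. A typical
improvement in $\kappa$ can be obtained if the left upper corner
of the matrix

$$R = \begin{pmatrix} R^{1,1} & R^{1,2} \\ R^{2,1} & R^{2,2} \end{pmatrix}$$

from the QR decomposition of $B$ has the form

$$R^{1,1} = \begin{pmatrix}
* & * & * & * & 0 & 0 & 0 & 0 \\
0 & * & * & * & 0 & 0 & 0 & 0 \\
0 & 0 & * & * & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & * & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & * & * & * & * \\
0 & 0 & 0 & 0 & 0 & * & * & * \\
0 & 0 & 0 & 0 & 0 & 0 & * & * \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & *
\end{pmatrix}, \tag{12}$$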



where * denotes any non-zero element. Indeed, in this 
case: 

1) We start the sphere decoding by going through
every combination of the 8 last real symbols
$g_9, \ldots, g_{16}$ (we are not choosing the ones that give
the minimal metric yet; we go through all the options
since we do not know how the last 8 symbols
will affect the total minimization problem). This
corresponds to treating the matrix $R^{2,2}$, and has
cost $|S|^8$.

2) We then look at the first 8 symbols $g_1, \ldots, g_8$,
corresponding to the matrix $R^{1,1}$, and for every
possible choice of the 8-tuple $(g_9, \ldots, g_{16})$, we decode
separately $g_1, \ldots, g_4$ and $g_5, \ldots, g_8$ thanks to
the structure of $R^{1,1}$, which has complexity $2|S|^4$.

Altogether, the above structure allows the
PAM symbols $g_1, g_2, g_3, g_4$ to be decoded independently of the symbols
$g_5, g_6, g_7, g_8$, yielding a worst case complexity of
$|S|^{12}$ (or more precisely $2|S|^{12}$) for the real sphere
decoding process instead of the full complexity order
of $|S|^{16}$.

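The natural question to ask is thus the design of codes
(that is, of the basis matrices $B_i$) that yield a sparse
matrix $R$. To address this question, we further study
the structure of the matrix $R$. By definition of the QR
decomposition of the matrix $B = (\mathbf{b}_1, \ldots, \mathbf{b}_{16})$, we have
that

$$R = \begin{pmatrix}
\langle \mathbf{e}_1, \mathbf{b}_1 \rangle & \langle \mathbf{e}_1, \mathbf{b}_2 \rangle & \cdots & \langle \mathbf{e}_1, \mathbf{b}_{16} \rangle \\
0 & \langle \mathbf{e}_2, \mathbf{b}_2 \rangle & \cdots & \langle \mathbf{e}_2, \mathbf{b}_{16} \rangle \\
\vdots & & \ddots & \vdots \\
0 & \cdots & 0 & \langle \mathbf{e}_{16}, \mathbf{b}_{16} \rangle
\end{pmatrix},$$

where

$$\mathbf{e}_1 = \frac{\mathbf{b}_1}{\|\mathbf{b}_1\|}, \quad
\mathbf{e}_2 = \frac{\mathbf{b}_2 - \mathrm{proj}_{\mathbf{e}_1} \mathbf{b}_2}{\|\mathbf{b}_2 - \mathrm{proj}_{\mathbf{e}_1} \mathbf{b}_2\|}, \quad
\mathbf{e}_k = \frac{\mathbf{b}_k - \sum_{j=1}^{k-1} \mathrm{proj}_{\mathbf{e}_j} \mathbf{b}_k}{\|\mathbf{b}_k - \sum_{j=1}^{k-1} \mathrm{proj}_{\mathbf{e}_j} \mathbf{b}_k\|},$$

and

$$\mathrm{proj}_{\mathbf{e}} \mathbf{b} = \frac{\langle \mathbf{e}, \mathbf{b} \rangle}{\langle \mathbf{e}, \mathbf{e} \rangle} \mathbf{e}.$$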
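The Gram-Schmidt description of R can be turned into a short sketch showing how orthogonality between groups of columns produces zeros in R. The toy 4-dimensional columns below are an assumption chosen so that the first two are orthogonal to the last two; they are not taken from any code in this paper.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt_qr(columns):
    """Return (E, R) with orthonormal e_k and R[i][j] = <e_i, b_j> for i <= j."""
    E = []
    for b in columns:
        v = list(b)
        for e in E:
            c = dot(e, b)                          # <e, b>, with <e, e> = 1
            v = [vi - c * ei for vi, ei in zip(v, e)]
        norm = math.sqrt(dot(v, v))
        E.append([vi / norm for vi in v])
    n = len(columns)
    R = [[dot(E[i], columns[j]) if i <= j else 0.0 for j in range(n)]
         for i in range(n)]
    return E, R

# Toy generator columns: {b1, b2} is orthogonal to {b3, b4}.
columns = [[1.0, 2.0, 0.0, 0.0],
           [3.0, 1.0, 0.0, 0.0],
           [0.0, 0.0, 1.0, 1.0],
           [0.0, 0.0, 2.0, 5.0]]
E, R = gram_schmidt_qr(columns)

for row in R:
    print(["*" if abs(t) > 1e-12 else "0" for t in row])

# Reconstruct each column as b_j = sum_i R[i][j] e_i to confirm B = QR.
rebuilt = [[sum(R[i][j] * E[i][r] for i in range(4)) for r in range(4)]
           for j in range(4)]
```

The printed pattern has a zero block in the upper right corner, which is the mechanism behind the block structure of (12).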



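The notation $\langle \cdot, \cdot \rangle$ stands for the usual inner product.
Thus having the upper left part of $R$ look like (12)
means that

$$\langle \mathbf{b}_i, \mathbf{b}_j \rangle = 0, \quad 1 \le i \le 4, \quad 5 \le j \le 8,$$

or equivalently, by recalling that $\mathbf{b}_i = \alpha(HB_i)$,

$$0 = \langle \alpha(HB_i), \alpha(HB_j) \rangle = \Re(\mathrm{Tr}(HB_i (HB_j)^\dagger)).$$

The second equality is true in general and can be shown
by a direct computation:

$$\langle \alpha(A), \alpha(B) \rangle = \Re(\mathrm{Tr}(AB^\dagger)). \tag{13}$$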
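Identity (13) is easy to confirm numerically. The following sketch compares both sides for a pair of random (2 x 4) complex matrices; the helper names and the random inputs are illustrative assumptions.

```python
import random

def alpha(M):
    """Row-by-row flattening into (Re, Im) pairs."""
    return [part for row in M for z in row for part in (z.real, z.imag)]

def conj_transpose(M):
    return [[M[r][c].conjugate() for r in range(len(M))] for c in range(len(M[0]))]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

rng = random.Random(3)
A = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(4)] for _ in range(2)]
B = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(4)] for _ in range(2)]

inner = sum(a * b for a, b in zip(alpha(A), alpha(B)))   # <alpha(A), alpha(B)>
real_trace = trace(matmul(A, conj_transpose(B))).real    # Re Tr(A B^dagger)
print(abs(inner - real_trace) < 1e-9)
```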

We have now connected the decoding complexity to
the code design. The above computations showed that if
the 16 basis matrices $B_1, \ldots, B_{16}$ satisfy

$$0 = \Re(\mathrm{Tr}(HB_i (HB_j)^\dagger)), \quad 1 \le i \le 4, \quad 5 \le j \le 8,$$

the worst case sphere decoding complexity is of the order
of $|S|^{12}$. This suggests further improvement: the current
process manages to separate the information symbols
into two groups, which could be repeated. Assume that
we could further have

$$0 = \Re(\mathrm{Tr}(HB_i (HB_j)^\dagger)), \quad 1 \le i \le 2, \quad 3 \le j \le 4$$

and

$$0 = \Re(\mathrm{Tr}(HB_i (HB_j)^\dagger)), \quad 5 \le i \le 6, \quad 7 \le j \le 8.$$

3) As earlier, we start the sphere decoding with the
matrix $R^{2,2}$ and go through all the possibilities for
the 8 last real symbols $g_9, \ldots, g_{16}$, for a cost of
$|S|^8$.

4) For the first 8 symbols $g_1, \ldots, g_8$ corresponding
to the matrix $R^{1,1}$, we first separate $g_1, \ldots, g_4$ and
$g_5, \ldots, g_8$, after which we decode independently
$\{g_1, g_2\}$, $\{g_3, g_4\}$, $\{g_5, g_6\}$ and $\{g_7, g_8\}$, each of
these costing $|S|^2$.

The worst case complexity is then $4|S|^8 |S|^2 = 4|S|^{10}$.
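The arithmetic behind these complexity orders can be tabulated in a few lines. This is a sketch under the assumption that a percentage reduction is measured on the sphere decoder dimension (the exponent of |S|); the function names are hypothetical.

```python
def exponent_reduction(full_dim, reduced_dim):
    """Percentage by which the sphere decoder dimension drops."""
    return 100.0 * (full_dim - reduced_dim) / full_dim

def worst_case_counts(alphabet_size):
    """Worst-case metric evaluations for the three structures discussed above."""
    s = alphabet_size
    return {
        "full": s ** 16,              # full upper triangular R
        "two_groups": 2 * s ** 12,    # R^{1,1} splits into two 4-symbol groups
        "four_groups": 4 * s ** 10,   # R^{1,1} splits into four 2-symbol groups
    }

print(exponent_reduction(16, 12))   # 25.0
print(exponent_reduction(16, 10))   # 37.5
print(worst_case_counts(4))
```

For 4-PAM this gives 4294967296, 33554432 and 4194304 evaluations respectively.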

Remark 3.1: It is possible to further reduce the (ML)
complexity by using so-called hard-limiting, see [5,
Section VI, p. 924 (1-2)]. In this case, the complexity
will be $4|S|^{4.5}$, where $|S|$ is the size of a complex
signal constellation. However, this is only possible when
a square constellation (e.g. $Q^2$-QAM) can be employed,
i.e., the constellation is a Cartesian product of two real
constellations (e.g. $Q$-PAM).

B. Examples from the ring of Hamiltonian quaternions 

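To illustrate the material explained above, let us start
with the Alamouti code [6], i.e., codewords of the form

$$X = \begin{pmatrix} x_1 & -x_2^* \\ x_2 & x_1^* \end{pmatrix}
= \begin{pmatrix} g_1 + i g_2 & -g_3 + i g_4 \\ g_3 + i g_4 & g_1 - i g_2 \end{pmatrix},$$

where $x_1, x_2$ are QAM symbols and $\mathbf{g} = (g_1, g_2, g_3, g_4)$
is the PAM symbol vector. A decomposition into basis
matrices $B_1, B_2, B_3, B_4$ is given by

$$X = g_1 B_1 + g_2 B_2 + g_3 B_3 + g_4 B_4,$$

where

$$B_1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad
B_2 = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \quad
B_3 = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \quad
B_4 = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}.$$
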
We assume transmission through a MISO channel described
by the vector

$$H = (h_1, h_2),$$

so that $\alpha(HB_i)$, $i = 1, 2, 3, 4$, is given by

$$\mathbf{b}_1 = \alpha(HB_1) = (\Re(h_1), \Im(h_1), \Re(h_2), \Im(h_2))^T,$$

$$\mathbf{b}_2 = \alpha(HB_2) = (-\Im(h_1), \Re(h_1), \Im(h_2), -\Re(h_2))^T,$$

$$\mathbf{b}_3 = \alpha(HB_3) = (\Re(h_2), \Im(h_2), -\Re(h_1), -\Im(h_1))^T,$$

$$\mathbf{b}_4 = \alpha(HB_4) = (-\Im(h_2), \Re(h_2), -\Im(h_1), \Re(h_1))^T.$$

We finally get

$$B = [\mathbf{b}_1, \mathbf{b}_2, \mathbf{b}_3, \mathbf{b}_4],$$

and since $\langle \mathbf{b}_i, \mathbf{b}_j \rangle = 0$ for $i \neq j$, the QR decomposition
of $B$ is of the form

$$B = \Big(\frac{1}{c} B\Big)(c I_4) = QR,$$

where

$$c = \sqrt{\Re(h_1)^2 + \Im(h_1)^2 + \Re(h_2)^2 + \Im(h_2)^2}$$

is a normalization factor which makes $Q$ orthonormal.
The matrix $R$ is indeed upper right triangular, with in
fact only zeroes above its diagonal. Thus the worst case
decoding complexity of such a code is the size of the
QAM alphabet, that is, of linear order.
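The orthogonality of the Alamouti generator columns can be verified for a random MISO channel. In the sketch below the four basis matrices are the ones forced by the codeword form (x1, -x2*; x2, x1*) with x1 = g1 + i g2 and x2 = g3 + i g4; the helper names and the random channel are assumptions for illustration.

```python
import random

def alpha(M):
    """Row-by-row flattening into (Re, Im) pairs."""
    return [part for row in M for z in row for part in (z.real, z.imag)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Basis matrices of the Alamouti code over the reals.
B1 = [[1, 0], [0, 1]]
B2 = [[1j, 0], [0, -1j]]
B3 = [[0, -1], [1, 0]]
B4 = [[0, 1j], [1j, 0]]

rng = random.Random(2)
H = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2)]]  # 1 x 2 MISO channel

columns = [alpha(matmul(H, Bi)) for Bi in (B1, B2, B3, B4)]
gram = [[sum(a * b for a, b in zip(u, v)) for v in columns] for u in columns]
c_squared = sum(abs(h) ** 2 for h in H[0])   # c^2 = |h1|^2 + |h2|^2

# The Gram matrix of the columns should be c^2 times the identity,
# so R = c * I_4 has no non-zero entries above the diagonal.
for row in gram:
    print([round(t, 6) for t in row])
```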

Finding basis matrices with similar properties as those
of the Alamouti code seems a difficult task. The question
is in general to find families of matrices $\{B_1, \ldots, B_K\}$
which are orthogonal in the sense that $\langle \alpha(B_i), \alpha(B_j) \rangle =
0$, $i \neq j$, and will keep this property even after multiplication
by an arbitrary channel matrix $H$. Let us start
modestly and ask whether we could find such a pair
of matrices $B, B' \in M_n(\mathbb{C})$ whose orthogonality will
survive a channel matrix $H \in M_{k \times n}(\mathbb{C})$, where $n \ge k$.
Using (13), we need to check that

$$0 = \langle \alpha(HB), \alpha(HB') \rangle = \Re(\mathrm{Tr}(HB(HB')^\dagger)).$$



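As a first example, take

$$B = \begin{pmatrix} x_1 & 0 \\ 0 & x_1^* \end{pmatrix} \quad \text{and} \quad
B' = \begin{pmatrix} 0 & -x_2^* \\ x_2 & 0 \end{pmatrix},$$

where $x_1, x_2 \in \mathbb{C}$. These two matrices clearly satisfy the
orthogonality relation $\langle \alpha(B), \alpha(B') \rangle = 0$. Now pick an
arbitrary complex matrix

$$H = \begin{pmatrix} h_1 & h_2 \\ h_3 & h_4 \end{pmatrix}.$$

A direct calculation shows that

$$\begin{aligned}
\mathrm{Tr}(HB(HB')^\dagger)
&= x_1 h_1 x_2^* h_2^* - h_2 x_2 h_1^* x_1^* + x_1 h_3 x_2^* h_4^* - x_2 h_4 h_3^* x_1^* \\
&= 2i\,\Im(x_1 h_1 x_2^* h_2^*) + 2i\,\Im(x_1 h_3 x_2^* h_4^*),
\end{aligned}$$

so that

$$\Re(\mathrm{Tr}(HB(HB')^\dagger)) = 0,$$

independently of the matrix $H$.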
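The channel-independence of this orthogonality can be probed numerically. The sketch below takes the Alamouti-type pair B = diag(x1, x1*) and B' with off-diagonal entries -x2* and x2, draws random symbols and random (2 x 2) channels, and evaluates the real part of Tr(HB(HB')^dagger). The helper names, the seed, and the number of trials are arbitrary assumptions.

```python
import random

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def conj_transpose(M):
    return [[M[r][c].conjugate() for r in range(len(M))] for c in range(len(M[0]))]

def orthogonality_defect(H, B, Bp):
    """Re Tr(HB (HB')^dagger); zero means alpha(HB) and alpha(HB') are orthogonal."""
    P = matmul(matmul(H, B), conj_transpose(matmul(H, Bp)))
    return sum(P[i][i] for i in range(len(P))).real

def random_complex(rng):
    return complex(rng.gauss(0, 1), rng.gauss(0, 1))

rng = random.Random(4)
defects = []
for _ in range(100):
    x1, x2 = random_complex(rng), random_complex(rng)
    B = [[x1, 0], [0, x1.conjugate()]]
    Bp = [[0, -x2.conjugate()], [x2, 0]]
    H = [[random_complex(rng) for _ in range(2)] for _ in range(2)]
    defects.append(abs(orthogonality_defect(H, B, Bp)))

print(max(defects) < 1e-9)
```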
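As a second example, consider

[Display: a pair of $(4 \times 4)$ matrices $B$ and $B'$ whose entries are complex symbols $x_i$ and their conjugates, together with a channel matrix $H$ with entries $h_1, \ldots, h_8$.]

We can similarly see that $\Re(\mathrm{Tr}(HB(HB')^\dagger)) = 0$.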

The notable thing however is that both examples are 
closely related to the Alamouti code (the first example 
being really included in it). This is not a surprise, since 
most of the work available on fast ML decodability tries 
to actually exploit the code structure. To pursue our 
investigation on fast decodability, we now need to focus 
on algebraic constructions of space-time lattice codes 
from division algebras. 

IV. Space-time codes from division algebras 
A. Background 

Since the work of Sethuraman et al. [19], a standard 
algebraic technique to build space-time block codes 
is to use cyclic division algebras over number fields 
(that is, finite extensions of the field Q). For the sake 
of completeness, we will start by recalling the formal 
definition of a cyclic algebra, after which we will provide 
an illustrative example, rather than redo the whole theory, 
which the reader can find in [19], or in the tutorial [22]. 

Definition 4.1: Let $K$ be an algebraic number field
and assume that $E/K$ is a cyclic Galois extension of
degree $n$ with Galois group $\mathrm{Gal}(E/K) = \langle \sigma \rangle$. We can
now define an associative $K$-algebra

$$\mathcal{A} = (E/K, \sigma, \gamma) = E \oplus uE \oplus u^2 E \oplus \cdots \oplus u^{n-1} E,$$

where $u \in \mathcal{A}$ is an auxiliary generating element subject
to the relations $xu = u\sigma(x)$ for all $x \in E$ and $u^n = \gamma \in
K^*$, where $K^*$ denotes $K$ without the zero element.

The element $\gamma$ is often called a non-norm element due
to its relation to the invertibility of the elements of $\mathcal{A}$.
Namely, if there exists no element $x \in E$ such that its
norm would be $N_{E/K}(x) = \gamma^t$, where $t \in \mathbb{Z}_+$ is a proper
divisor of $n$, then $\mathcal{A}$ will be a division algebra [23, Prop.
2.4.5]. This result is a straightforward simplification of
a theorem by Albert [24].

Space-time codewords are obtained by considering 
matrices of left multiplication by an element of A in 
the above basis. 

Let us see how the coding is done more concretely
through an example. We first need a number field $E$ of
degree $n$ whose Galois group is cyclic. For example, take
$\zeta_5 = e^{2i\pi/5}$, a primitive 5th root of unity, and consider
the number field $E = \mathbb{Q}(i, \zeta_5)$ over $K = \mathbb{Q}(i)$, given by

$$\mathbb{Q}(i, \zeta_5) = \{x = a + b\zeta_5 + c\zeta_5^2 + d\zeta_5^3 \mid a, b, c, d \in \mathbb{Q}(i)\}.$$

It is of degree 4 (i.e., of dimension 4 as a vector space)
over $\mathbb{Q}(i)$. Let us assume that we want to encode QAM
symbols. Since they can be seen as elements in $\mathbb{Z}[i] \subset
\mathbb{Q}(i)$, we have that one element $x$ in $\mathbb{Q}(i, \zeta_5)$ encodes 4
QAM symbols, namely $a, b, c, d$, as linear combinations
in the given basis. The Galois group of $\mathbb{Q}(i, \zeta_5)/\mathbb{Q}(i)$
describes maps that permute $\zeta_5$ and its conjugates $\zeta_5^j$,
$j = 2, 3, 4$, while fixing $\mathbb{Q}(i)$. If $\sigma(\zeta_5) = \zeta_5^2$, we have
that

$$\sigma^2(\zeta_5) = \zeta_5^4, \quad \sigma^3(\zeta_5) = \zeta_5^3, \quad \sigma^4(\zeta_5) = \zeta_5,$$

yielding a cyclic Galois group. We now build an asso- 
ciative algebra A based on E. As a vector space, A can 
be seen as a sum of n copies of the chosen number field 
E of degree n. In our example, this gives 

$$\mathcal{A} = \mathbb{Q}(i, \zeta_5) \oplus u\mathbb{Q}(i, \zeta_5) \oplus u^2\mathbb{Q}(i, \zeta_5) \oplus u^3\mathbb{Q}(i, \zeta_5),$$

where $\{1, u, u^2, u^3\}$ forms a basis and $\gamma = u^4$ must
be an element of the base field $\mathbb{Q}(i)$, say $u^4 = i$. A
space-time block code can be obtained by considering
the matrix of left multiplication in this given basis. If
$x = x_0 + ux_1 + u^2x_2 + u^3x_3 \in \mathcal{A}$, $x_0, x_1, x_2, x_3 \in \mathbb{Q}(i, \zeta_5)$,
then its corresponding multiplication matrix is

$$X = \begin{pmatrix}
x_0 & i\sigma(x_3) & i\sigma^2(x_2) & i\sigma^3(x_1) \\
x_1 & \sigma(x_0) & i\sigma^2(x_3) & i\sigma^3(x_2) \\
x_2 & \sigma(x_1) & \sigma^2(x_0) & i\sigma^3(x_3) \\
x_3 & \sigma(x_2) & \sigma^2(x_1) & \sigma^3(x_0)
\end{pmatrix},$$

where the factor $i$ comes from $u^4 = i$ and $\sigma^j$, $j = 1, 2, 3, 4$,
are the elements of the Galois group, appearing
due to the non-commutative multiplication defined on $\mathcal{A}$
by $xu = u\sigma(x)$ for $x \in E$.
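The construction above can be checked numerically. The following is a minimal sketch, not part of the original construction: it assumes the generator $\sigma(\zeta_5) = \zeta_5^2$, represents an element of $\mathbb{Q}(i, \zeta_5)$ by its four coefficients in the basis $1, \zeta_5, \zeta_5^2, \zeta_5^3$, and uses the hypothetical helper names `sigma_power`, `codeword` and `det`.

```python
import cmath

ZETA5 = cmath.exp(2j * cmath.pi / 5)  # primitive 5th root of unity
GAMMA = 1j                            # non-norm element: u^4 = i


def sigma_power(coeffs, j):
    """Value of sigma^j(x) for x = sum_k coeffs[k] * zeta5^k.

    Assumes sigma(zeta5) = zeta5^2, so sigma^j(zeta5) = zeta5^(2^j mod 5).
    """
    e = pow(2, j, 5)
    return sum(c * ZETA5 ** (k * e) for k, c in enumerate(coeffs))


def codeword(x0, x1, x2, x3):
    """Left-multiplication matrix of x0 + u*x1 + u^2*x2 + u^3*x3."""
    xs = (x0, x1, x2, x3)
    X = [[0j] * 4 for _ in range(4)]
    for r in range(4):
        for c in range(4):
            entry = sigma_power(xs[(r - c) % 4], c)
            # entries above the diagonal pick up the factor gamma = i
            X[r][c] = GAMMA * entry if r < c else entry
    return X


def det(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    A = [row[:] for row in M]
    n = len(A)
    d = 1 + 0j
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        if abs(A[p][c]) < 1e-12:
            return 0j
        if p != c:
            A[c], A[p] = A[p], A[c]
            d = -d
        d *= A[c][c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= f * A[c][k]
    return d


# exponents of zeta5 under sigma, sigma^2, sigma^3, sigma^4
exponents = [pow(2, j, 5) for j in range(1, 5)]

zero = (0, 0, 0, 0)
# an arbitrary non-zero element with coefficients in Z[i]
d = det(codeword((1, 1j, 0, 0), (0, 1, 0, 0), (2, 0, 0, -1), zero))
```

The list `exponents` runs through 2, 4, 3, 1, which is the cyclic structure of the Galois group under the assumed generator. For coefficients in $\mathbb{Z}[i]$ the value `d` should land, up to rounding error, on a non-zero Gaussian integer.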

Let $\mathcal{C}$ be the codebook formed by codewords $X$ of
the above form. For it to be fully diverse, recall from
(3) that it is enough to have

$$\det(X' - X'') \neq 0$$

for $X' \neq X''$ in $\mathcal{C}$, or equivalently, by linearity since we
are considering space-time lattice codes,

$$\det(X) \neq 0$$

for $X \neq 0$ in $\mathcal{C}$. This can be obtained by asking for $\mathcal{A}$
to be a division algebra, a property that depends on the
choice of the value of $\gamma$ ($\gamma = i$ in our example). If
there exists no element $a \in \mathbb{Q}(i, \zeta_5)$ such that its norm
is $i$ or $i^2$, i.e., $N_{\mathbb{Q}(i,\zeta_5)/\mathbb{Q}(i)}(a) = i$ or $-1$, then $\mathcal{A}$ will
be a division algebra [24], [23].

Let us check that $\mathcal{A}$ is indeed a division algebra. Note
for this purpose that $\mathbb{Q}(\zeta_5 + \zeta_5^{-1}) = \mathbb{Q}(\sqrt{5})$ is a subfield
of $\mathbb{Q}(\zeta_5)$. Suppose now that there exists an element $a \in E$
such that $N_{\mathbb{Q}(i,\zeta_5)/\mathbb{Q}(i)}(a) = i$; then, by transitivity of
the norm,

$$N_{\mathbb{Q}(i,\zeta_5)/\mathbb{Q}(i)}(a) = N_{\mathbb{Q}(i,\sqrt{5})/\mathbb{Q}(i)}\big(N_{\mathbb{Q}(i,\zeta_5)/\mathbb{Q}(i,\sqrt{5})}(a)\big) = i,$$

which implies the existence of an element
$b = N_{\mathbb{Q}(i,\zeta_5)/\mathbb{Q}(i,\sqrt{5})}(a)$ such that

$$N_{\mathbb{Q}(i,\sqrt{5})/\mathbb{Q}(i)}(b) = i,$$

a contradiction [25].

The case of a norm of $-1$ is tougher though. However,
there are several ways to deal with it. We refer the reader
to [16, Section 8], where the proof used for the algebra
$\mathcal{D}_4$ can be used here verbatim.

We have thus constructed in our example a fully diverse
$(4 \times 4)$ space-time code matrix. It furthermore has
the non-vanishing determinant property (see Definition
2.6), since the information symbols are restricted to algebraic
integers in $E$, and hence the minimum determinant
belongs to $\mathbb{Z}[i]$, yielding $\min_{X \neq 0} |\det(X)| = 1$ (cf. [16]).

We conclude with two important invariants of central
simple algebras. Central simple $K$-algebras are algebras
whose center is $K$ and which have only trivial two-sided
ideals. Cyclic algebras are particular cases of central
simple algebras. We could have stated these definitions
only for cyclic algebras, but for the rest of this work, we
will need them in more generality.

Definition 4.2: Let $\mathcal{A}$ be a central simple $K$-algebra.
The degree of $\mathcal{A}$ is the integer $\deg(\mathcal{A}) = \sqrt{\dim_K(\mathcal{A})}$.

Wedderburn's theorem is a major theorem in the
theory of central simple algebras, which tells that every
central simple algebra (and thus in particular every cyclic
algebra) is isomorphic to a matrix algebra over a central
division $K$-algebra $\mathcal{D}$.

Definition 4.3: The index of $\mathcal{A}$ is the integer
$\mathrm{ind}(\mathcal{A}) = \deg(\mathcal{D})$, where $\mathcal{D}$ is the unique central division
$K$-algebra associated to $\mathcal{A}$ by Wedderburn's theorem.

We have that $\mathrm{ind}(\mathcal{A}) \mid \deg(\mathcal{A})$, and equality holds if
and only if $\mathcal{A}$ is a division algebra.

B. Examples 

Let us now consider a few well known examples of 
division algebra codes, and see how they behave with 
respect to fast decodability. 

The Alamouti code [6] can be seen from an algebraic
perspective as a cyclic division algebra

$$\mathcal{D}_{Alam} = (\mathbb{Q}(i)/\mathbb{Q}, \sigma, -1), \qquad (15)$$

where $\sigma$ is the complex conjugation. This is a $\mathbb{Q}$-central
division algebra of index 2, whose cyclic representation
indeed yields codewords of the type

$$\begin{pmatrix} x_1 & -x_2^* \\ x_2 & x_1^* \end{pmatrix},$$

where $x_i$ are in $\mathbb{Z}[i]$ (that is, they are QAM symbols).

This algebra is more commonly known as the Hamiltonian
quaternions

$$\mathbb{H} = \{a + ib + jc + ijd \mid a, b, c, d \in \mathbb{R}\},$$

where $i^2 = j^2 = -1$, $ij = -ji$.

Probably the most important property of this code 
is that, when used over a MISO channel, its worst 
case decoding complexity is linear, as was shown in 
Subsection III-B. 
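As a small illustration of the identification with the quaternions, the sketch below builds the Alamouti codeword for given complex symbols and represents the quaternion units $i$ and $j$ as $(2 \times 2)$ matrices of the same shape. The helper names `alamouti`, `matmul` and `det2` are chosen here for illustration only.

```python
def alamouti(x1, x2):
    """Codeword [[x1, -x2*], [x2, x1*]] of the Alamouti code."""
    x1, x2 = complex(x1), complex(x2)
    return [[x1, -x2.conjugate()], [x2, x1.conjugate()]]


def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def det2(M):
    """Determinant of a (2 x 2) matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


# Matrix images of the quaternion units i and j
QI = alamouti(1j, 0)   # diag(i, -i)
QJ = alamouti(0, 1)    # [[0, -1], [1, 0]]
MINUS_ID = [[-1, 0], [0, -1]]

X = alamouti(1 + 2j, 3 - 1j)
energy = det2(X)       # equals |x1|^2 + |x2|^2 = 5 + 10
```

The determinant of a codeword is $|x_1|^2 + |x_2|^2$, which vanishes only for the zero codeword. This is the full-diversity property in its simplest form.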

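Let us now consider the division algebra

$$\mathcal{D}_{ort} = (\mathbb{Q}(i, \sqrt{2})/\mathbb{Q}(\sqrt{2}), \sigma, -1) \qquad (16)$$

from [7]. This is an index 2 algebra with center $\mathbb{Q}(\sqrt{2})$.
It can be turned into a space-time code by mapping the
element $x = a_1 + a_2\zeta_8 + ua_3 + u\zeta_8 a_4 \in \mathcal{D}_{ort}$ to a
codeword $X$ given by

$$X = \begin{pmatrix}
a_1 + a_2\zeta_8 & -a_3^* - a_4^*\zeta_8^* & 0 & 0 \\
a_3 + a_4\zeta_8 & a_1^* + a_2^*\zeta_8^* & 0 & 0 \\
0 & 0 & a_1 - a_2\zeta_8 & -a_3^* + a_4^*\zeta_8^* \\
0 & 0 & a_3 - a_4\zeta_8 & a_1^* - a_2^*\zeta_8^*
\end{pmatrix},$$

where $a_j = g_{2j-1} + ig_{2j} \in \mathbb{Z}[i]$, $j = 1, 2, 3, 4$. We can
now write this in the form

$$X = \sum_{j=1}^{8} g_j B_j,$$

where $g = (g_1, \ldots, g_8)$ is the PAM symbol vector, and
the basis matrices are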

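$$B_1 = \mathrm{diag}(1, 1, 1, 1), \quad B_3 = \mathrm{diag}(\zeta_8, \zeta_8^*, -\zeta_8, -\zeta_8^*),$$

$$B_2 = \mathrm{diag}(i, -i, i, -i), \quad B_4 = \mathrm{diag}(i\zeta_8, -i\zeta_8^*, -i\zeta_8, i\zeta_8^*),$$

$$B_5 = \begin{pmatrix} 0 & -1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \quad
B_7 = \begin{pmatrix} 0 & -\zeta_8^* & 0 & 0 \\ \zeta_8 & 0 & 0 & 0 \\ 0 & 0 & 0 & \zeta_8^* \\ 0 & 0 & -\zeta_8 & 0 \end{pmatrix},$$

$$B_6 = \begin{pmatrix} 0 & i & 0 & 0 \\ i & 0 & 0 & 0 \\ 0 & 0 & 0 & i \\ 0 & 0 & i & 0 \end{pmatrix}, \quad
B_8 = \begin{pmatrix} 0 & i\zeta_8^* & 0 & 0 \\ i\zeta_8 & 0 & 0 & 0 \\ 0 & 0 & 0 & -i\zeta_8^* \\ 0 & 0 & -i\zeta_8 & 0 \end{pmatrix}.$$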

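The decoding complexity of this code for a MISO
channel is $2|S|^4$ instead of the maximal complexity
$|S|^8$. Indeed, write the channel $H = (h_1, h_2, h_3, h_4)$ as
$(H_1, H_2)$ with $H_1 = (h_1, h_2)$ and $H_2 = (h_3, h_4)$, so that

$$HB_j = (H_1, H_2)\begin{pmatrix} B_j^{1,1} & 0 \\ 0 & B_j^{2,2} \end{pmatrix} = (H_1B_j^{1,1}, H_2B_j^{2,2}),$$

where $B_j^{1,1}$ and $B_j^{2,2}$ denote the two $(2 \times 2)$ diagonal blocks of $B_j$,
whence $\Re(\mathrm{Tr}(HB_i(HB_j)^\dagger))$ simplifies to

$$\Re\big(\mathrm{Tr}(H_1B_i^{1,1}(B_j^{1,1})^\dagger H_1^\dagger) + \mathrm{Tr}(H_2B_i^{2,2}(B_j^{2,2})^\dagger H_2^\dagger)\big).$$

The basis matrices are closely related to those of the
Alamouti code given in Subsection III-B, and it is easy,
using the known orthogonality relations of the Alamouti
basis matrices, to see that

$$\Re(\mathrm{Tr}(HB_i(HB_j)^\dagger)) = 0, \quad i = 1, 2, 3, 4, \quad j = 5, 6, 7, 8,$$

yielding an upper triangular matrix $R$ of the same form
as in (12), and consequently a decoding complexity of
$2|S|^4$.

TABLE I
Code constructions: algebraic properties versus
decoding complexity

MISO code            | matrix  | center                 | index | complexity (real) | max $|S|^K$
$\mathcal{D}_{Alam}$ | (2 x 2) | $\mathbb{Q}$           | 2     | $|S|$             | $|S|^4$
$\mathcal{D}_{ort}$  | (4 x 4) | $\mathbb{Q}(\sqrt{2})$ | 2     | $|S|^4$           | $|S|^8$
$\mathcal{A}_2$      | (2 x 2) | $\mathbb{Q}$           | 2     | $|S|^4$           | $|S|^4$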
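The orthogonality relation between the diagonal basis matrices $B_1, \ldots, B_4$ and the off-diagonal ones $B_5, \ldots, B_8$ can be checked numerically. The sketch below is an illustration under stated assumptions: the codeword is taken to be block diagonal with the two Alamouti-type blocks of $\mathcal{D}_{ort}$, the basis matrices are obtained by encoding unit PAM vectors, and the channel `h` is a random complex row vector. The names `codeword`, `gram_entry` and `cross` are hypothetical.

```python
import cmath
import random

ZETA8 = cmath.exp(1j * cmath.pi / 4)  # primitive 8th root of unity


def codeword(g):
    """Block-diagonal (4 x 4) codeword for a real PAM vector g of length 8."""
    # a_j = g_{2j-1} + i*g_{2j}, j = 1..4 (0-based indices below)
    a1, a2, a3, a4 = (complex(g[2 * j], g[2 * j + 1]) for j in range(4))
    z, zc = ZETA8, ZETA8.conjugate()
    c = lambda v: v.conjugate()
    return [
        [a1 + a2 * z, -c(a3) - c(a4) * zc, 0, 0],
        [a3 + a4 * z, c(a1) + c(a2) * zc, 0, 0],
        [0, 0, a1 - a2 * z, -c(a3) + c(a4) * zc],
        [0, 0, a3 - a4 * z, c(a1) - c(a2) * zc],
    ]


# B_j is the codeword of the j-th unit vector, since X = sum_j g_j B_j
basis = [codeword([1 if k == j else 0 for k in range(8)]) for j in range(8)]


def gram_entry(h, Bi, Bj):
    """Re Tr(h Bi (h Bj)^dagger) for a (1 x 4) channel row vector h."""
    u = [sum(h[r] * Bi[r][col] for r in range(4)) for col in range(4)]
    v = [sum(h[r] * Bj[r][col] for r in range(4)) for col in range(4)]
    return sum(x * y.conjugate() for x, y in zip(u, v)).real


random.seed(0)
h = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(4)]

# entries coupling B_1..B_4 with B_5..B_8; all should vanish
cross = [gram_entry(h, basis[i], basis[j])
         for i in range(4) for j in range(4, 8)]
```

All sixteen values in `cross` vanish up to rounding error, which is the block of zeros that makes the two groups of four real symbols separately decodable.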

Our final example is the division algebra

$$\mathcal{A}_2 = (\mathbb{Q}(\sqrt{3})/\mathbb{Q}, \sigma, -1),$$

where $\sigma(\sqrt{3}) = -\sqrt{3}$. This algebra is of index 2 with
center $\mathbb{Q}$, and yields codewords of the form

$$\begin{pmatrix} x_1 + x_2\sqrt{3} & -x_3 + x_4\sqrt{3} \\ x_3 + x_4\sqrt{3} & x_1 - x_2\sqrt{3} \end{pmatrix},$$

where $x_i \in \mathbb{Z}$. However, as far as we know there is no
existing method to reduce the decoding complexity of
this code.

We already observed in Subsection III-B that from the
decoding perspective, it might be beneficial for codes
to inherit some of the special structure of the Alamouti
code. This study of different algebraic code structures
seems to concur with the same conclusion, expressed
now in algebraic terms as: a code should be a subset
of $M_k(\mathbb{H})$ for some $k$. However, which algebras exactly
give fast decodability still seems unclear (see Table I).
In the following section, we are going to answer this
question.






V. Embedding codes into matrix rings of the
Hamiltonian quaternions

We have so far discussed fast decodability of space-time
codes via sphere decoding, and through several
heuristic examples concluded that codewords in rings
$M_k(\mathbb{H})$, for some $k$ and $\mathbb{H}$ the Hamiltonian quaternions,
are prone to offer orthogonality relations that induce
fast sphere decoding. Therefore our main interest is now
to study space-time codes that are subsets of the rings
$M_k(\mathbb{H})$. This will be characterized by the ramification
of the cyclic algebra over which the space-time code is
built.



A. Embedding division algebras into $M_k(\mathbb{H})$

Let $K/\mathbb{Q}$ be an algebraic extension of degree $m$. We
then have that

$$m = r_1 + 2r_2,$$

where $r_1$ is the number of real embeddings and $r_2$ the
number of pairs of complex embeddings of $K$. We call
these embeddings the infinite primes of the field $K$ and
the non-zero prime ideals of the ring $\mathcal{O}_K$ the finite primes
of the field $K$. If the embedding is complex, resp. real,
we call it a complex, resp. real, prime. To each prime $P$,
finite or infinite, corresponds a local field $K_P$, obtained
by completion of $K$ with respect to the absolute value
induced by $P$ (the same way $\mathbb{R}$ is obtained from $\mathbb{Q}$ by
completion with respect to the usual absolute value).

Let $\mathcal{A}$ be a central division $K$-algebra of index and
thus degree $n$. Consider

$$\mathcal{A}_P = \mathcal{A} \otimes_K K_P,$$

a central simple $K_P$-algebra, which is known to be
isomorphic to $M_r(\mathcal{D})$ for some $r$ and some central
division $K_P$-algebra $\mathcal{D}$. We denote by $m_P$ the index of
$\mathcal{A}_P$ and call it the local index of $\mathcal{A}$ at $P$. We say that $P$
is ramified in $\mathcal{A}$ if $m_P > 1$.

Let us define the space $G(\mathbb{C})_n \subset M_{n \times 2n}(\mathbb{C})$ by

$$G(\mathbb{C})_n = \{(B^*, B) \in M_{n \times 2n}(\mathbb{C}) \mid B \in M_n(\mathbb{C})\},$$

where $B^* = (b_{ij}^*)$. Now $\mathcal{A} \otimes_{\mathbb{Q}} \mathbb{R}$ is a semi-simple $\mathbb{R}$-algebra,
and can thus be written as a Cartesian product of simple
subalgebras. Its center is $K \otimes_{\mathbb{Q}} \mathbb{R}$, which is isomorphic to
a product of copies of $\mathbb{R}$ or $\mathbb{C}$: a copy of $\mathbb{R}$ for each real embedding
of $K$, and one of $\mathbb{C}$ for each pair of conjugate complex
embeddings. The simple components of $\mathcal{A} \otimes_{\mathbb{Q}} \mathbb{R}$ will thus
have these factors as centers, and will be central
simple algebras over either $\mathbb{R}$ or $\mathbb{C}$: those over $\mathbb{C}$ will be matrix
algebras over $\mathbb{C}$, while those over $\mathbb{R}$ will be either matrix
algebras over $\mathbb{R}$ if $\mathcal{A}$ is not ramified in the corresponding
real prime, or matrix algebras over $\mathbb{H}$ if $\mathcal{A}$ is ramified.
Formally, we obtain the isomorphism [26]

$$\mathcal{A} \otimes_{\mathbb{Q}} \mathbb{R} \cong M_{n/2}(\mathbb{H})^{\omega} \times M_n(\mathbb{R})^{r_1 - \omega} \times G(\mathbb{C})_n^{r_2}, \qquad (17)$$

where $\omega$ is the number of real places where $\mathcal{A}$ ramifies.
Therefore each element in $\mathcal{A}$ can be seen as a
concatenation of $\omega$ matrices in $M_n(\mathbb{C})$, $r_1 - \omega$ matrices
in $M_n(\mathbb{R})$ and $r_2$ pairs of conjugate matrices in $M_n(\mathbb{C})$,
or alternatively as a matrix in $M_{n \times nm}(\mathbb{C})$, recalling that
$m = r_1 + 2r_2$.

The above isomorphism (17) implies an injection

$$\psi : \mathcal{A} \hookrightarrow \mathrm{diag}(M_{n/2}(\mathbb{H})^{\omega} \times M_n(\mathbb{R})^{r_1 - \omega} \times G(\mathbb{C})_n^{r_2}), \qquad (18)$$

where the diag-operator places the $i$th $(n \times n)$ block to
the $i$th diagonal block of a matrix in $M_{mn}(\mathbb{C})$. From
(18), we now see that it is possible to embed a division
algebra $\mathcal{A}$ into $M_k(\mathbb{H})$ if and only if

$$\psi : \mathcal{A} \hookrightarrow \mathrm{diag}(M_{n/2}(\mathbb{H})^m), \qquad (19)$$

namely we must have $r_2 = 0$ and $r_1 - \omega = 0$. In
summary, we have the following.

Corollary 5.1: In order to be able to embed a division
$K$-algebra $\mathcal{A}$ into $M_{n/2}(\mathbb{H})$:

• The center $K$ cannot have complex places, that is,
it must be totally real ($r_1 = m$).

• Combined with the equation $r_1 - \omega = 0$, we then
have that $\omega = m$, so that all the infinite places of
$K$ must be ramified in $\mathcal{A}$.

Let us then suppose that $K$ is indeed a totally real
number field. We shall now give a simple family of cyclic
$K$-algebras that fulfill the second condition above.

Proposition 5.2: Let $\mathcal{A} = (E/K, \sigma, \gamma)$ be a cyclic
division algebra, where $E$ is a CM-field (i.e., $E$ is a
totally complex field containing a totally real field $E_1$
such that $[E : E_1] = 2$). Let $\eta_1, \ldots, \eta_m$ be the $\mathbb{Q}$-embeddings
of $K$. If $\eta_i(\gamma)$ is negative for every $\eta_i$, then
all the infinite places of $K$ are ramified in $\mathcal{A}$.

Proof: Let us suppose that $P_i$ is one of the infinite
primes of the field $K$ and that $\eta_i$ is the corresponding
$\mathbb{Q}$-embedding. Let $k$ be the smallest possible positive
power such that $\sigma^k$ fixes the totally real subfield $E_1$ of
$E$. We then have [27, Theorem 30.8]

$$(E/K, \sigma, \gamma) \otimes_K K_{P_i} \sim (EK_{P_i}/K_{P_i}, \sigma^k, \eta_i(\gamma)), \qquad (20)$$

where $\sim$ refers to equivalence in the Brauer group
$B(K_{P_i})$. Because $P_i$ is a real prime, we can identify
$K_{P_i}$ with $\mathbb{R}$ and, similarly, $EK_{P_i}$ with $\mathbb{C}$, so that from
(20) we get $\langle\sigma^k\rangle = \mathrm{Gal}(\mathbb{C}/\mathbb{R})$. Finally,

$$(E/K, \sigma, \gamma) \otimes_K K_{P_i} \sim (\mathbb{C}/\mathbb{R}, \sigma^*, \eta_i(\gamma)),$$

where $\sigma^*$ is the complex conjugation and $\eta_i(\gamma)$ is
a negative real number. The claim now follows as
$(\mathbb{C}/\mathbb{R}, \sigma^*, \eta_i(\gamma)) \cong \mathbb{H}$. ■

We point out that for rational numbers $r$ we have
$\eta_i(r) = r$. Therefore a negative rational number is
always a suitable non-norm element if $\mathcal{A}$ is a division
algebra.
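The role of the sign of the non-norm element at a real place can be seen on the local algebra $(\mathbb{C}/\mathbb{R}, \sigma^*, c)$ itself. In the sketch below, which uses the hypothetical helper names `left_mult` and `det2`, the element $z + uw$ with $u^2 = c$ is written as its $(2 \times 2)$ left-multiplication matrix, whose determinant is $|z|^2 - c|w|^2$.

```python
def left_mult(z, w, c):
    """Left-multiplication matrix of z + u*w in (C/R, conj, c), u^2 = c."""
    z, w = complex(z), complex(w)
    return [[z, c * w.conjugate()], [w, z.conjugate()]]


def det2(M):
    """Determinant of a (2 x 2) matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


# c < 0: determinant |z|^2 + |c||w|^2 > 0, so every non-zero element is invertible
det_negative = det2(left_mult(1, 1, -1.0))

# c > 0: 1 + u has determinant 1 - c = 0 for c = 1, so it is a zero divisor
det_positive = det2(left_mult(1, 1, 1.0))
```

For negative $c$ the determinant is a sum of two non-negative terms and the algebra is a copy of $\mathbb{H}$. For positive $c$ there are non-zero elements of determinant zero, so the algebra splits as $M_2(\mathbb{R})$ and the real place is not ramified.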

Example 5.1: The algebras $\mathcal{D}_{ort}$ and $\mathcal{D}_{Alam}$ discussed
above both fulfill the conditions of Proposition 5.2.
Therefore $\mathcal{D}_{Alam}$ can be embedded into $M_1(\mathbb{H}) = \mathbb{H}$
and $\mathcal{D}_{ort}$ into $M_2(\mathbb{H})$.

B. Embedding space-time lattice codes into $M_k(\mathbb{H})$

We have given in Corollary 5.1 the conditions for
a division algebra $\mathcal{A}$ of index $n$ to be embedded into
$M_{n/2}(\mathbb{H})$. To obtain a space-time lattice code, we need
to select a discrete subset of $\mathcal{A}$, namely one of its orders.
We denote by $\mathcal{O}_K$ the ring of integers of $K$, and similarly
by $\mathcal{O}_E$ the ring of integers of $E$.

Definition 5.1: An $\mathcal{O}_K$-order $\Lambda$ in $\mathcal{A}$ is a subring of
$\mathcal{A}$, having the same identity element as $\mathcal{A}$, and such that
$\Lambda$ is a finitely generated module over $\mathcal{O}_K$ and generates
$\mathcal{A}$ as a linear space over $K$.

This choice is motivated by the following example:

Example 5.2: Let $E/K$ be a cyclic extension of algebraic
number fields and $(E/K, \sigma, \gamma)$ be a cyclic division
algebra, with $\gamma \in K^*$ an algebraic integer. The $\mathcal{O}_K$-module

$$\Lambda = \mathcal{O}_E \oplus u\mathcal{O}_E \oplus \cdots \oplus u^{n-1}\mathcal{O}_E$$

is a subring of the cyclic algebra $(E/K, \sigma, \gamma)$. We refer
to this ring as the natural order [7]. Most space-time
lattice codes built from division algebras [19], [9] have
been further restricted to this natural order.

In theoretical considerations we will later mostly consider
$\mathcal{O}_K$-orders (where $K$ is the center), but the connection
to coding theory is more visible if we consider
$\mathcal{O}_K$-orders as $\mathbb{Z}$-modules.

Definition 5.2: A $\mathbb{Z}$-order $\Lambda$ in $\mathcal{A}$ is a subring of $\mathcal{A}$,
having the same identity element as $\mathcal{A}$, and such that $\Lambda$
is a finitely generated module over $\mathbb{Z}$ and generates $\mathcal{A}$
as a linear space over $\mathbb{Q}$.

The ring $\mathbb{Z}$ is a principal ideal domain and therefore
a $\mathbb{Z}$-order is not only finitely generated as a $\mathbb{Z}$-module,
but it also has a $\mathbb{Z}$-basis. This basis is also a $\mathbb{Q}$-basis for
the algebra $\mathcal{A}$. In particular a $\mathbb{Z}$-basis of an order in $\mathcal{A}$
has $\dim_{\mathbb{Q}}(\mathcal{A})$ elements.

Remark 5.1: The ring $\mathcal{O}_K$ is a finitely generated $\mathbb{Z}$-module.
It is also known that $K$ is generated by $\mathcal{O}_K$ as a linear
space over $\mathbb{Q}$. These results reveal that any $\mathcal{O}_K$-order is
also a $\mathbb{Z}$-order.

Let us again consider a general division algebra $\mathcal{A}$
having a center $K$, where $[K : \mathbb{Q}] = m$, and let $\psi$ be
the embedding of $\mathcal{A}$ defined in (18).

Proposition 5.3: Let $\Lambda$ be a $\mathbb{Z}$-order of $\mathcal{A}$. Then $\psi(\Lambda)$
is an $mn^2$-dimensional lattice in $M_{mn}(\mathbb{C})$. If

$$\{a_1, \ldots, a_{mn^2}\}$$

is a $\mathbb{Z}$-basis of the order $\Lambda$, then

$$\{\psi(a_1), \ldots, \psi(a_{mn^2})\}$$

is a $\mathbb{Z}$-basis of the lattice $\psi(\Lambda)$.

Over the non-zero elements of the order $\Lambda$, we have

$$\det\nolimits_{min}(\psi(\Lambda)) \geq 1.$$

In particular $\psi(\Lambda)$ is a space-time lattice code that has
the NVD property (see Definition 2.6) and dimension
rate $mn^2/mn = n$.

Proof: The $\mathbb{Z}$-basis of $\Lambda$ has $\dim_{\mathbb{Q}}(\mathcal{A})$ elements. We
have that $\mathcal{A}$ is of index $n$ and thus degree $n$, so it is of
dimension $n^2$ over the center $K$. The center $K$ on the
other hand is an $m$-dimensional $\mathbb{Q}$-vector space. Overall
we get that $\dim_{\mathbb{Q}}(\mathcal{A}) = mn^2$. Let us now consider
a $\mathbb{Z}$-basis $\{a_1, \ldots, a_{mn^2}\}$ of $\Lambda$. While it is clear that
the set $\{\psi(a_1), \ldots, \psi(a_{mn^2})\}$ does generate $\psi(\Lambda)$, it is
not directly obvious that $\psi(a_1), \ldots, \psi(a_{mn^2})$ are linearly
independent over $\mathbb{R}$. For this result and for the claim on
$\det_{min}(\psi(\Lambda))$, we refer the reader to [26].

According to Definition 2.2, the dimension rate $R_1$ for
the code $\psi(\Lambda)$ is given by

$$R_1 = \frac{\dim_{\mathbb{R}}(\psi(\Lambda))}{nm} = \frac{mn^2}{nm} = n$$

dimensions per channel use. ■

Remark 5.2: Due to the above connection between an
order and a lattice, we may equally call a lattice code
an order code.

If we now concentrate on codes that are embeddable
into $M_k(\mathbb{H})$, we need to restrict to a $K$-central division
algebra $\mathcal{A}$ of index $n$, where $K$ is totally real and all
the infinite places are ramified. We then get from (19)
an embedding

$$\psi : \mathcal{A} \hookrightarrow \mathrm{diag}(M_{n/2}(\mathbb{H})^m) \subset \mathrm{diag}(M_n(\mathbb{C})^m).$$

By taking an order $\Lambda \subset \mathcal{A}$, we get a lattice code

$$\psi(\Lambda) = \mathbb{Z}A_1 \oplus \cdots \oplus \mathbb{Z}A_{mn^2} \subset M_{nm}(\mathbb{C}),$$

where $A_i \in M_{nm/2}(\mathbb{H})$, $i = 1, \ldots, mn^2$, form a $\mathbb{Z}$-basis
of the lattice. Its dimension rate is similarly $n$. It
is clear that forcing a space-time code to be embedded
in $M_{n/2}(\mathbb{H})$ imposes an extra constraint. The next result
characterizes this constraint in terms of the dimension
rate.



Proposition 5.4: Let us suppose that we have a lattice
space-time code $\mathcal{C} \subset M_k(\mathbb{C}) \cap M_{k/2}(\mathbb{H})$, where $k$ is
even. We then have that

$$\dim_{\mathbb{R}}(\mathcal{C}) \leq k^2.$$

Consequently, the dimension rate $R_1$ of $\mathcal{C}$ as given in
Definition 2.2 is at most $k$.

Proof: We can see that, as a subspace in $M_2(\mathbb{C})$,
the ring of Hamiltonian quaternions has real dimension 4. Each
matrix in $M_{k/2}(\mathbb{H})$ consists of $(k/2)^2$ freely chosen $(2 \times 2)$
blocks that have the inner structure of Hamiltonian
quaternions. Therefore we have

$$\dim_{\mathbb{R}}(M_{k/2}(\mathbb{H})) = 4 \cdot \frac{k^2}{4} = k^2.$$

■

If we compare the rate $n$ of $\psi(\Lambda)$ with this result, we
get $n$ versus $nm$, where $m = [K : \mathbb{Q}]$. There is thus
a trade-off between fast decodability and rate. However,
by choosing the center of the algebra $\mathcal{A}$ to be $\mathbb{Q}$, we can
meet the optimal dimension rate of Proposition 5.4.
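The comparison between the rate of an order code and the bound of Proposition 5.4 is simple arithmetic. The sketch below spells it out for the two cases met so far, with the function names `order_code_rate` and `quaternionic_rate_bound` chosen here for illustration.

```python
def order_code_rate(n, m):
    """Dimension rate of psi(Lambda): an m*n^2-dimensional lattice
    of (nm x nm) matrices, i.e. m*n^2 / (n*m) = n."""
    return (m * n * n) // (n * m)


def quaternionic_rate_bound(k):
    """Upper bound k^2 / k = k of Proposition 5.4 for (k x k) codewords."""
    return (k * k) // k


# D_ort: index n = 2, center Q(sqrt(2)) of degree m = 2, (4 x 4) codewords
rate_ort = order_code_rate(2, 2)
bound_ort = quaternionic_rate_bound(2 * 2)

# a Q-central algebra of index 4: m = 1, (4 x 4) codewords
rate_q = order_code_rate(4, 1)
bound_q = quaternionic_rate_bound(4 * 1)
```

With center $\mathbb{Q}(\sqrt{2})$ the code reaches rate 2 against a bound of 4, while with center $\mathbb{Q}$ the rate and the bound coincide at 4.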

Remark 5.3: We warn the reader here. The theory 
developed so far is not explicit in a sense that while it 
does give us a good description of how to construct the 
needed division algebras (see Proposition 5.2), we have 
not given an explicit method to produce the embedding 
(18). In particular, we have no guarantee that the left 
regular representation would have anything to do with 
the embedding (18). In Section VII and the following 
parts of the paper, we will show that there are methods 
to overcome this problem and that the left regular 
representation can work as a good starting point. 

VI. Bounds and existence results for matrix
lattices in $M_k(\mathbb{H})$

So far, we have given conditions for a central division
$K$-algebra $\mathcal{A}$ to be embedded into $M_k(\mathbb{H})$ and shown
how to obtain fast-decodable space-time lattice codes
from orders of $\mathcal{A}$. In this section we are going to
give bounds and existence results for such codes, taking
into account an extra code design criterion, namely the
normalized minimum determinant of a lattice code.

A. Normalized minimum determinant of an order code 

The minimum determinant $\det_{min}(\mathcal{C})$ is a widely used
concept to predict the performance of a finite space-time
code $\mathcal{C}$, since it determines its coding gain. In order to
compare two finite space-time codes $\mathcal{C}_1, \mathcal{C}_2 \subset M_n(\mathbb{C})$,
one must first check that

• both codebooks have an equal number of elements:
$|\mathcal{C}_1| = |\mathcal{C}_2|$, and
• both codes are scaled so that the maximum
power used is equal: $\max\{\|A\|_F^2 \mid A \in \mathcal{C}_1\} = \max\{\|B\|_F^2 \mid B \in \mathcal{C}_2\}$.

In the case of infinite lattice codes, due to the discreteness
of the set, a non-zero minimum determinant automatically
yields the NVD property. Among two NVD
codes using the same maximum power, the one with
higher minimum determinant will have better coding
gain for the infinite lattice, and will thus provide us with
a bound on the coding gain of any finite constellation
carved from it. Now given an infinite space-time lattice
code $\mathcal{C}$, a number $R$ of codewords, and a fixed power
constraint, there are different ways to pick a finite
constellation that may lead to different coding gains.

The two most typical encoding methods are linear dis- 
persion encoding (cf. the discussion underneath Equation 
(2)) and spherical encoding. These encoding methods 
usually result in different constellation shaping, that can 
be either cubic (more generally orthogonal) shaping, pro- 
vided the lattice is orthogonal to start with, or spherical 
shaping. The two possible shapes are described below in 
more detail. 

Spherical shaping. Just as for Gaussian channels, the
most energy efficient way to choose codewords from a
given lattice is to use spherical shaping. This means
that we choose the needed number of lowest energy
codewords from the space-time lattice code $\mathcal{C}$ and then
scale the finite code $\mathcal{C}(r)$ given by

$$\mathcal{C}(r) = \{A \mid A \in \mathcal{C}, \|A\|_F \leq r\} \subset \mathcal{C} \qquad (21)$$

to meet the power constraint, where $r$ depends on the
number $R$ of wanted codewords. For large code sizes,
this approach will roughly give lattice points inside a $K$-sphere,
where $K$ is the rank of the code lattice (= number
of dispersion matrices).
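Spherical shaping as in (21) can be sketched by brute force on a small lattice. The example below is only an illustration: it takes the four Alamouti basis matrices as the lattice basis, searches integer coefficient vectors in a bounded box (the parameter `span` is an assumption of this sketch and must be large enough for the requested size), and keeps the lowest-energy points.

```python
from itertools import product

# Alamouti basis matrices, used here only as a small test lattice
BASIS = [
    [[1, 0], [0, 1]],
    [[1j, 0], [0, -1j]],
    [[0, -1], [1, 0]],
    [[0, 1j], [1j, 0]],
]


def energy(coeffs, basis):
    """Squared Frobenius norm of sum_k coeffs[k] * basis[k]."""
    total = 0.0
    for r in range(len(basis[0])):
        for c in range(len(basis[0])):
            z = complex(sum(g * B[r][c] for g, B in zip(coeffs, basis)))
            total += z.real ** 2 + z.imag ** 2
    return total


def spherical_constellation(basis, size, span=2):
    """Coefficient tuples of the `size` lowest-energy lattice points
    found in the box [-span, span]^K."""
    candidates = product(range(-span, span + 1), repeat=len(basis))
    return sorted(candidates, key=lambda g: energy(g, basis))[:size]


points = spherical_constellation(BASIS, 9)
```

For this lattice the nine selected points are the origin and the eight points $\pm B_i$ of squared norm 2, that is, the first two shells of the lattice. A practical encoder would of course not enumerate a box but use a lattice point enumeration routine.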

To fairly compare two finite codes $\mathcal{C}_1(r)$ and $\mathcal{C}_2(r)$,
one should first scale them so that both the lattices
have a fundamental parallelotope of volume 1. Since we
consider a space-time lattice code $\mathcal{C} \subset M_n(\mathbb{C})$, to define
its volume we first map it to $\mathbb{R}^{2n^2}$ via $\alpha$, yielding the
lattice $\alpha(\mathcal{C})$ whose basis is $\{\alpha(B_1), \ldots, \alpha(B_K)\}$, obtained
from the basis $\{B_1, \ldots, B_K\}$ of $\mathcal{C}$. The generator
matrix $M$ of $\alpha(\mathcal{C})$ is $M = (\alpha(B_1), \ldots, \alpha(B_K))$, where
$\alpha(B_i)$ are column vectors, and we define the measure
(or volume) $m(\mathcal{C})$ of the fundamental parallelotope of
the space-time lattice $\mathcal{C}$ by

$$m(\mathcal{C})^2 = \det(M^TM) = \det\Big(\big(\Re\,\mathrm{Tr}(B_iB_j^\dagger)\big)_{1 \leq i,j \leq K}\Big).$$

To combine the notion of minimum determinant with
that of scaling the volume of the lattice to evaluate the
performance of finite constellations, we use the notion
of normalized minimum determinant $\delta(\mathcal{C})$, obtained by
first scaling the lattice $\mathcal{C}$ to have a unit size fundamental
parallelotope and then taking the minimum determinant
of the resulting scaled lattice. A simple computation
proves the following.

Lemma 6.1: Let $\mathcal{C}$ be a $K$-dimensional space-time
lattice in $M_n(\mathbb{C})$. We then have that

$$\delta(\mathcal{C}) = \det\nolimits_{min}(\mathcal{C})/(m(\mathcal{C}))^{n/K}.$$

The normalized minimum determinant predicts which
lattice is likely to produce the finite codes with the
biggest minimum determinants, while using spherical
shaping.
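Lemma 6.1 translates directly into a short computation. The sketch below forms the Gram matrix $(\Re\,\mathrm{Tr}(B_iB_j^\dagger))$ from a list of basis matrices, takes its determinant to obtain $m(\mathcal{C})^2$, and applies the lemma. The minimum determinant is passed in as an argument rather than computed, and the Alamouti lattice with $\det_{min} = 1$ serves as the assumed test case. The function names are hypothetical.

```python
import math


def gram_matrix(basis):
    """G[i][j] = Re Tr(B_i B_j^dagger), computed entrywise."""
    def inner(A, B):
        return sum(complex(a) * complex(b).conjugate()
                   for ra, rb in zip(A, B) for a, b in zip(ra, rb)).real
    return [[inner(Bi, Bj) for Bj in basis] for Bi in basis]


def real_det(M):
    """Determinant of a real matrix by Gaussian elimination."""
    A = [row[:] for row in M]
    n = len(A)
    d = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        if abs(A[p][c]) < 1e-12:
            return 0.0
        if p != c:
            A[c], A[p] = A[p], A[c]
            d = -d
        d *= A[c][c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= f * A[c][k]
    return d


def normalized_min_det(basis, det_min):
    """delta(C) = det_min(C) / m(C)^(n/K) for K basis matrices of size n."""
    n, K = len(basis[0]), len(basis)
    volume = math.sqrt(real_det(gram_matrix(basis)))
    return det_min / volume ** (n / K)


ALAMOUTI = [
    [[1, 0], [0, 1]],
    [[1j, 0], [0, -1j]],
    [[0, -1], [1, 0]],
    [[0, 1j], [1j, 0]],
]
delta = normalized_min_det(ALAMOUTI, det_min=1.0)
```

Here the Gram matrix is twice the identity, so $m(\mathcal{C}) = 4$ and $\delta = 1/4^{2/4} = 1/2$.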

Cubic shaping. We also consider another kind of
shaping, called cubic or orthogonal shaping.

Definition 6.1: We say that a space-time lattice $\mathcal{C}$ in
$M_n(\mathbb{C})$ is orthogonal or rectangular if the corresponding
real lattice $\alpha(\mathcal{C})$ has a basis that is orthogonal according
to the normal inner product of the space $\mathbb{R}^{2n^2}$. If
the basis vectors are all of equal length, we say that
$\mathcal{C}$ is orthonormal.

When the lattice is orthogonal, there is no point in
employing spherical shaping (21), for we get the same
result by using simple linear dispersion encoding (see
the remark in the end of this section) as described after
Equation (2).

One can get bounds for the normalized minimum
determinant also in the case of cubic shaping, as for
example:

Proposition 6.2: [28] Let us suppose that $\mathcal{C}$ is an
orthogonally shaped 16-dimensional space-time lattice
code in $M_4(\mathbb{C})$. We then have that

$$\delta(\mathcal{C}) \leq \frac{1}{16} = 0.0625.$$

In the particular case where $\mathcal{C}$ is an order code, that
is $\mathcal{C} = \psi(\Lambda)$, with $\Lambda$ an order of an index $n$ division
algebra $\mathcal{A} = (E/K, \sigma, \gamma)$ and $[K : \mathbb{Q}] = m$, we know
from Proposition 5.3 that $\psi(\Lambda)$ is an $mn^2$-dimensional
lattice in $M_{mn}(\mathbb{C})$ with $\det_{min}(\psi(\Lambda)) = 1$, so that

$$\delta(\psi(\Lambda)) = 1/(m(\psi(\Lambda)))^{1/n}$$

and the normalized minimum determinant only depends
on the volume of the fundamental parallelotope of the
order code.

Remark 6.1: Note that whether one uses
linear dispersion encoding (i.e., a symmetric coefficient
set) or spherical shaping (i.e., an optimized coefficient
set) has nothing to do with the shape of the original
lattice. Even though the lattice is not orthogonal, we can
employ both encoding methods. If the lattice is not badly
skewed, then the difference between the two methods is
usually not very big, whereas for highly skewed lattices
one may see a gap of several dB.

For orthogonal lattices, both methods will give the 
same result, provided that the target constellation size is 
suitable for a symmetric coefficient set to start with. 

B. Bounds and existence results 

Since the normalized minimum determinant of an or- 
der code only depends on the volume of its fundamental 
parallelotope, one may wonder whether, given a center 
K, it is possible to find the smallest volume an order 
inside any division algebra of a given index n can have. 

To answer this question, we first further characterize 
the volume of the order by connecting it to an invariant 
of the order. 

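Proposition 6.3: [26] Let $\Lambda$ be a $\mathbb{Z}$-order in $\mathcal{A}$ and let
$\psi$ be the embedding (18). We then have that

$$m(\psi(\Lambda)) = \sqrt{|d(\Lambda/\mathbb{Z})|},$$

where $d(\Lambda/\mathbb{Z})$ is the $\mathbb{Z}$-discriminant of the order $\Lambda$ (see
[27], [16] for an exact definition), and further that

$$\delta(\psi(\Lambda)) = \left(\frac{1}{|d(\Lambda/\mathbb{Z})|}\right)^{1/2n}.$$

Clearly, the smaller the absolute value of the $\mathbb{Z}$-discriminant
of an order is, the greater the normalized
minimum determinant will be.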

Inside a given algebra the Z-orders having the smallest 
possible discriminant are called maximal orders. All the 
maximal orders of a given division algebra share the 
same discriminant. 

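While each $\mathcal{O}_K$-order is also a $\mathbb{Z}$-order, the converse
does not have to be true. However, if a $\mathbb{Z}$-order $\Lambda$
is also an $\mathcal{O}_K$-module, it is an $\mathcal{O}_K$-order and its $\mathcal{O}_K$-discriminant
$d(\Lambda/\mathcal{O}_K)$ is related to its $\mathbb{Z}$-discriminant
by the following transitivity formula:

Lemma 6.4: Let $\mathcal{A}$ be a $K$-central division algebra of
index $n$ and let $\Lambda$ be an $\mathcal{O}_K$-order in $\mathcal{A}$. Then, viewing
$\Lambda$ as a $\mathbb{Z}$-order in $\mathcal{A}$,

$$d(\Lambda/\mathbb{Z}) = N_{K/\mathbb{Q}}(d(\Lambda/\mathcal{O}_K))\, d(\mathcal{O}_K/\mathbb{Z})^{n^2},$$

where $d(\mathcal{O}_K/\mathbb{Z})$ is just the usual number field discriminant
of the extension $K/\mathbb{Q}$.

To summarize, we have just shown that the normalized
minimum determinant

$$\delta(\psi(\Lambda)) = 1/(m(\psi(\Lambda)))^{1/n}$$

is given by

$$\delta(\psi(\Lambda)) = \left(\frac{1}{|N_{K/\mathbb{Q}}(d(\Lambda/\mathcal{O}_K))\, d(\mathcal{O}_K/\mathbb{Z})^{n^2}|}\right)^{1/2n}.$$

This reveals that we only have to consider the term

$$N_{K/\mathbb{Q}}(d(\Lambda/\mathcal{O}_K)),$$

as $d(\mathcal{O}_K/\mathbb{Z})^{n^2}$ is fixed (when $K$ is fixed). The
$\mathcal{O}_K$-discriminant $d(\Lambda/\mathcal{O}_K)$ is an ideal in $\mathcal{O}_K$, but
$N_{K/\mathbb{Q}}(d(\Lambda/\mathcal{O}_K))$ can be seen as an element in $\mathbb{Z}$.
Therefore we can discuss the size of ideals of $\mathcal{O}_K$. By
this, we mean that ideals are ordered by the absolute
values of their norms to $\mathbb{Q}$. For example, if $\mathcal{O}_K = \mathbb{Z}[i]$,
we say that the prime ideal generated by $2 + i$ is smaller
than the prime ideal generated by 3, because they have
norms 5 and 9, respectively.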
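The ordering of ideals by norm is easy to reproduce for $\mathcal{O}_K = \mathbb{Z}[i]$, where every ideal is principal and the norm of the ideal generated by $a + bi$ is $a^2 + b^2$. The short sketch below, with the hypothetical helper `gaussian_norm`, sorts a few generators accordingly.

```python
def gaussian_norm(a, b):
    """Norm to Q of the principal ideal (a + b*i) of Z[i]."""
    return a * a + b * b


# generators given as (a, b) for a + b*i
generators = [(3, 0), (2, 1), (1, 1), (7, 0)]
ordered = sorted(generators, key=lambda g: gaussian_norm(*g))
```

The ideal generated by $1 + i$ has norm 2 and comes first, followed by $2 + i$ (norm 5), $3$ (norm 9) and $7$ (norm 49).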

We are now ready to state the bounds that characterize
the best order codes in terms of normalized minimum
determinant. The hypotheses take into account that the
order code can be embedded into $M_k(\mathbb{H})$, for some $k$.

In the following, we use the notation $2 \,\|\, n$, which
means that 2 divides $n$, but 4 does not.

Proposition 6.5: Let $\mathcal{A}$ be a $K$-central division algebra
of index $n$, $2 \mid n$, where $K$ is a totally real number
field, and let $P_1 < P_2$ be a pair of smallest primes in $K$.
Let us suppose that all the infinite primes are ramified
in $\mathcal{A}$.

If $2 \,\|\, n$ and $2 \mid [K : \mathbb{Q}]$, then the minimum
discriminant of $\mathcal{A}$ is

If $4 \mid n$, then the minimum discriminant of $\mathcal{A}$ is

If $2 \,\|\, n$ and $2 \nmid [K : \mathbb{Q}]$, then the minimal discriminant
of $\mathcal{A}$ is

p n(n-l) pfc(fc-l) 

Proof: The proof with related background as well 
as more general bounds can be found in Appendix. ■ 
Example 6.1: Consider the question of building a 16-dimensional
lattice code in $M_4(\mathbb{C})$ with the best achievable
normalized minimum determinant. The order code
$\psi(\Lambda)$ gives an $mn^2$-dimensional lattice code in $M_{nm}(\mathbb{C})$
for any order $\Lambda$. To have $nm = 4$ and $mn^2 = 16$, the
only option is to choose $m = 1$ and $n = 4$. According
to Proposition 12.3, we have that the smallest possible
discriminant for a $\mathbb{Q}$-central division algebra of index 4
is $2^{12} \cdot 3^{12}$. Let us now suppose that

$$\mathcal{A} = (E/\mathbb{Q}, \sigma, \gamma)$$

is the algebra having a maximal order $\Lambda$ with the
promised discriminant. According to Proposition 6.3 we
have that

$$m(\psi(\Lambda)) = 6^6 \quad \text{and} \quad \delta(\psi(\Lambda)) = \left(\frac{1}{6^{12}}\right)^{1/8} = 0.068\ldots$$

Proposition 6.5 tells us that we can achieve this bound
even with a 16-dimensional lattice in $M_4(\mathbb{C}) \cap M_2(\mathbb{H})$.

In [10], the authors managed to build a 16-dimensional
lattice code IA-MAX in $M_4(\mathbb{C})$ having a normalized
minimum determinant equal to 0.1361.... We however
conjecture that 0.068... is the best possible normalized minimum
determinant for a lattice in $M_4(\mathbb{C}) \cap M_2(\mathbb{H})$.
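The numerical value in Example 6.1 follows from Proposition 6.3 by direct evaluation. A minimal sketch, with the hypothetical function name `delta_from_discriminant`:

```python
def delta_from_discriminant(abs_disc, n):
    """Normalized minimum determinant (1/|d|)^(1/(2n)) of Proposition 6.3."""
    return (1.0 / abs_disc) ** (1.0 / (2 * n))


disc = 2 ** 12 * 3 ** 12            # smallest discriminant for index 4 over Q
volume = disc ** 0.5                # m(psi(Lambda)) = 6^6
delta = delta_from_discriminant(disc, 4)
orthogonal_bound = 1 / 16           # Proposition 6.2
```

The result is $6^{-3/2} \approx 0.0680$, slightly above the value $0.0625$ that Proposition 6.2 allows for orthogonally shaped lattices of the same dimension.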

VII. Explicit construction methods 

So far our study has been mostly theoretical. No
explicit constructions resulting from the mapping $\psi$ (18)
have yet been given. We have only proved that the
matrix lattices with NVD described above exist. Let us
now suppose that we have a $K$-central division algebra
$\mathcal{D} = (E/K, \sigma, \gamma)$, where $[K : \mathbb{Q}] = m$ and $[E : K] = n$.
There exist $m$ $\mathbb{Q}$-embeddings $\beta_1, \ldots, \beta_m$ from $K$ to $\mathbb{C}$. For each
$\beta_i$ we can find an embedding $\sigma_i : E \to \mathbb{C}$ such that
$\sigma_i|_K = \beta_i$. Let us now suppose that $\{\sigma_1, \ldots, \sigma_m\}$ is a
set of representatives of such extensions of the embeddings $\beta_i$.

By using the left maximal representation we get an
embedding $\phi : \mathcal{D} \to M_n(E) \subset M_n(\mathbb{C})$. Let us suppose
that $a$ is an element of $\mathcal{D}$ and $A$ is the corresponding
matrix $\phi(a)$. We then get a mapping

$$\psi^* : \mathcal{D} \to M_{mn \times mn}(\mathbb{C}), \qquad (22)$$

which is defined by

$$a \mapsto \mathrm{diag}(\sigma_1(A), \ldots, \sigma_m(A)).$$

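We now have the following explicit version of the
previously defined embedding (18).

Proposition 7.1: Let us suppose that $\Lambda$ is a $\mathbb{Z}$-order in
$\mathcal{D}$ and that $\psi^*$ is the embedding (22) defined above. Then
$\psi^*(\Lambda)$ is an $mn^2$-dimensional lattice in $M_{mn \times nm}(\mathbb{C})$. Over
the non-zero elements of the order $\Lambda$ we have

$$\det\nolimits_{min}(\psi^*(\Lambda)) \geq 1.$$

However, in general we might lose the connection
between the volume of the fundamental parallelotope
of the order code $\psi^*(\Lambda)$ and the $\mathbb{Z}$-discriminant of $\Lambda$.
If, however, we can choose the left regular representation
and the embeddings $\sigma_1, \ldots, \sigma_m$ correctly, we have the
following. Let us suppose that we have a center
$K$ and an index $n$ division algebra $\mathcal{A}$ such that

$$\mathcal{A} \otimes_{\mathbb{Q}} \mathbb{R} \cong M_{n/2}(\mathbb{H})^{\omega} \times M_n(\mathbb{R})^{r_1 - \omega} \times G(\mathbb{C})_n^{r_2}.$$

Proposition 7.2: Let us suppose that $\Lambda$ is a $\mathbb{Z}$-order in
$\mathcal{A}$ and that $\psi^*$ is the previously defined embedding. If we
can choose $\sigma_1, \ldots, \sigma_m$ and a left maximal representation
$\phi$ so that

$$\psi^*(\Lambda) \subset \mathrm{diag}(M_{n/2}(\mathbb{H})^{\omega} \times M_n(\mathbb{R})^{r_1 - \omega} \times G(\mathbb{C})_n^{r_2}),$$

we get

$$m(\psi^*(\Lambda)) = \sqrt{|d(\Lambda/\mathbb{Z})|}$$

and

$$\delta(\psi^*(\Lambda)) = \left(\frac{1}{|d(\Lambda/\mathbb{Z})|}\right)^{1/2n}.$$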

Proof: Under the assumption that the embeddings
and the maximal representation are chosen as presented,
the proof of these claims is verbatim the same as for
Proposition 6.3 and can therefore be found in [26]. ■

Unfortunately, in the proof of the following proposition
we have to use some notions not defined in this paper.

Proposition 7.3: Let us suppose we have an index
$n$ $\mathbb{Q}$-central division algebra $\mathcal{D}$ with a $\mathbb{Z}$-order $\Lambda$, and let
$\phi$ denote the left regular representation. If we have a real
matrix $M$ such that

$$M\phi(\mathcal{D})M^{-1} \subset M_{n/2}(\mathbb{H}),$$

then

$$\delta(M\phi(\Lambda)M^{-1}) = \left(\frac{1}{|d(\Lambda/\mathbb{Z})|}\right)^{1/2n}.$$

Proof: We will give the proof in the case where the 
index is 2. The generalization is obvious and we will 
meet all the needed ideas already in this simplest case. 

Let us suppose that 0(A) has a Z-basis 
{A 1 ,A 2 ,A 3 ,A 4 }. We denote Bi = MAiM~ l and 
set B = {(Bi, . . . , B4}. We can flatten the matrix 
Bi into a 4-tuple L(Bi) by first forming a vector of 
length 4 out of the entries of Ai (e.g. row by row). The 
following identities are now easily seen 



L(Bi)L(Bj) T 



Tr(BiBj) 



(23) 



and 



L(B t )L(Bj Y = Tr{BiBj). (24) 
The Gram matrix of the lattice M0(A)M _1 is 
G = (X(Tr(BiB})))l j=1 . 

Both $B_i$ and $B_j$ have the Alamouti structure and therefore so does $B_iB_j^{\dagger}$. This reveals that $\mathrm{Tr}(B_iB_j^{\dagger}) \in \mathbb{R}$ and we can omit taking the real part from the Gram matrix. According to Equation (23) we can now write

$$G = \left(L(B_i)L(B_j)^{\dagger}\right)_{i,j=1}^{4} = L(B)L(B)^{\dagger},$$

where the rows of the $4 \times 4$ matrix $L(B)$ consist of the vectors $L(B_i)$. A simple permutation of the columns and elementary properties of determinants give us that

$$|\det(L(B))\det(L(B)^{\dagger})| = |\det(L(B))\det(L(B')^T)|,$$

where $L(B')$ is a matrix with the rows $L((B_i)^T)$. According to Equation (24) we now have

$$L(B)L(B')^T = \left(\mathrm{Tr}(MA_iA_jM^{-1})\right)_{i,j=1}^{4}.$$

A general result on matrix traces tells us that $\mathrm{Tr}(XCX^{-1}) = \mathrm{Tr}(C)$ for any matrix $C$ and any invertible matrix $X$. This result combined with the definition of the discriminant now gives us that

$$\det\left(L(B)L(B')^T\right) = \det\left(\mathrm{Tr}(MA_iA_jM^{-1})\right)_{i,j=1}^{4} = \det\left(\mathrm{Tr}(A_iA_j)\right)_{i,j=1}^{4} = d(\Lambda/\mathbb{Z}).$$

■
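The identities used in the proof can be checked numerically on a toy instance. The following Python sketch is only an illustration under our own assumptions: the two matrices of Alamouti type, the diagonal matrix `X`, and all helper names (`flatten`, `alamouti`, and so on) are hypothetical and not taken from the paper. It evaluates both sides of (23) and (24), and the trace before and after a conjugation.

```python
# Toy numerical check of identities (23), (24) and of Tr(X C X^{-1}) = Tr(C).
# All matrices below are hypothetical 2x2 examples chosen for illustration.

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dagger(A):
    """Conjugate transpose."""
    return [[A[j][i].conjugate() for j in range(len(A))] for i in range(len(A[0]))]

def transpose(A):
    return [[A[j][i] for j in range(len(A))] for i in range(len(A[0]))]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def flatten(A):
    """L(A): the entries of A listed row by row."""
    return [entry for row in A for entry in row]

def alamouti(a, b):
    """Matrix of Alamouti (quaternion) type [[a, -b*], [b, a*]]."""
    return [[a, -b.conjugate()], [b, a.conjugate()]]

Bi = alamouti(1 + 2j, 3 - 1j)
Bj = alamouti(-2 + 1j, 0.5 + 4j)

# (23): L(Bi) L(Bj)^dagger = Tr(Bi Bj^dagger)
lhs23 = sum(x * y.conjugate() for x, y in zip(flatten(Bi), flatten(Bj)))
rhs23 = trace(matmul(Bi, dagger(Bj)))

# (24): L(Bi) L(Bj^T)^T = Tr(Bi Bj)
lhs24 = sum(x * y for x, y in zip(flatten(Bi), flatten(transpose(Bj))))
rhs24 = trace(matmul(Bi, Bj))

# Trace invariance under conjugation by an invertible (here real diagonal) X.
X = [[2.0, 0.0], [0.0, 0.5]]
X_inv = [[0.5, 0.0], [0.0, 2.0]]
C = matmul(Bi, Bj)
conjugated_trace = trace(matmul(matmul(X, C), X_inv))
```

Because the product of two matrices of Alamouti type is again of that type, `rhs23` should have a vanishing imaginary part, which is the reason the real part can be dropped from the Gram matrix.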

Example 7.1: Consider from (15) the division algebra 

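$$\mathcal{D}_{Alam} = (\mathbb{Q}(i)/\mathbb{Q}, \sigma, -1),$$

which has index 2 and center $\mathbb{Q}$. The field $\mathbb{Q}$ has only one infinite place $\infty$ and according to Proposition 5.2 it is ramified in the algebra $\mathcal{D}_{Alam}$. We thus have an embedding $\mathcal{D}_{Alam} \hookrightarrow \mathbb{H}$ given by (19). If we choose a $\mathbb{Z}$-order $\Lambda$ in $\mathcal{D}_{Alam}$, then $\psi(\Lambda) \subset \mathbb{H} \subset M_2(\mathbb{C})$ is a 4-dimensional lattice code.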

Here the left regular representation directly gives us 
an explicit version (see (22) and Proposition 7.1) of this 
mapping. As demonstrated in the beginning of the paper, 
it also gives us a fast-decodable code. 

Example 7.2: Let us consider the example we gave in 
the very beginning of the paper. The cyclic algebra 

$$\mathcal{D}_{ort} = (\mathbb{Q}(i,\sqrt{2})/\mathbb{Q}(\sqrt{2}), \sigma, -1)$$

is an index 2 division algebra with center $\mathbb{Q}(\sqrt{2})$. Here $\sigma$ is simply the complex conjugation. The general theory tells us that $\mathcal{D}_{ort}$ can be embedded into $M_2(\mathbb{H})$.

Again the mapping from Proposition 22 will directly 
give us an explicit version of the embedding in (19). 
The field $\mathbb{Q}(\sqrt{2})$ has two $\mathbb{Q}$-embeddings $\beta_1, \beta_2$, where $\beta_1(\sqrt{2}) = \sqrt{2}$ and $\beta_2(\sqrt{2}) = -\sqrt{2}$. The corresponding $\mathbb{Q}$-embeddings $\sigma_1$ and $\sigma_2$ are defined by the equations $\sigma_1 = \mathrm{id}$, $\sigma_2(i) = i$ and $\sigma_2(\sqrt{2}) = -\sqrt{2}$ (or equivalently $\sigma_2(\zeta_8) = -\zeta_8$). The natural order $\Lambda$ consists of elements $x = a_1 + a_2\zeta_8 + ua_3 + u\zeta_8a_4$, where $a_i \in \mathbb{Z}[i]$. The left regular representation now gives us

$$\phi(x) = \begin{pmatrix} a_1 + a_2\zeta_8 & -a_3^* - a_4^*\zeta_8^* \\ a_3 + a_4\zeta_8 & a_1^* + a_2^*\zeta_8^* \end{pmatrix}.$$

It is then an easy task to see that

$$\sigma_2(\phi(x)) = \begin{pmatrix} a_1 - a_2\zeta_8 & -a_3^* + a_4^*\zeta_8^* \\ a_3 - a_4\zeta_8 & a_1^* - a_2^*\zeta_8^* \end{pmatrix}.$$

In particular both $\phi(x)$ and $\sigma_2(\phi(x))$ are elements in $\mathbb{H}$ and Proposition 7.2 can be applied. These results reveal that the example code we gave in the beginning of the paper was just an instance of the general theory developed above.

Remark 7.1: These two examples may give a slightly 
too rosy picture of the power of our theory. In both cases, 
the embedding in Proposition 7.1 exactly imitated the 
embedding (19). On top of that, this representation also 
led to codes with reduced decoding complexity. 
However, we do not have any guarantee that either of these 
things will happen more generally. It heavily depends 
on the chosen maximal subfield, non-norm element and 
even on the chosen generator of the Galois group. In 
Sections VIII and X we will meet situations where 
the left regular representation does not directly give us 
the required embedding even when the division algebra 
has the correct algebraic structure. Yet, in all these 
cases a simple manipulation applied after the left regular 
representation will give us an embedding to the matrix 
ring of quaternions and codes that have reduced decoding 
complexity. While this may seem to be accidental, there 
are some underlying algebraic principles that explain the 
sudden "luck" we encounter, see Section XI. 

VIII. Fast-decodable 4x2 MIDO codes 

So far, we have developed an algebraic theory of fast- 
decodable codes through different characterizations and 
bounds. We are now finally putting our theory into use 
to give a few different coding strategies that lead to fast- 
decodable codes. We start with MIDO codes for 4 Tx 
antennas, with the following properties: 

• They are 16-dimensional lattices in $M_4(\mathbb{C})$. 

• They satisfy the NVD property. 

• Their decoding complexity ranges from $|S|^{10}$ to $|S|^{16}$ when a real alphabet of size $|S|$ is used. 

A. A family of fast-decodable MIDO codes with Q as a 
center 

We give here an example of a MIDO code built following step by step the theory developed so far. The starting point is to consider a division algebra that can be embedded into $M_2(\mathbb{H})$ via the embedding $\psi$ (18). According to Section V and Proposition 5.2, we consider a $\mathbb{Q}$-central division algebra $\mathcal{A} = (E/\mathbb{Q}, \sigma, \gamma)$ of index 4, where $E$ is a CM field and $\gamma$ a negative non-norm element, namely 

c1) $[E:\mathbb{Q}] = 4$, 

c2) $\gamma, \gamma^2 \notin N_{E/\mathbb{Q}}(E^*)$, 

c3) $\mathrm{Gal}(E/\mathbb{Q}) = \langle\sigma\rangle$ with $\sigma^2(f) = f^*$, where $f^*$ stands for the complex conjugate of $f$, and 

c4) $\gamma < 0$. 

One instance of such an algebra is 

$$\mathcal{D}_{mido} = (\mathbb{Q}(\zeta_5)/\mathbb{Q}, \sigma, -8/9),$$

where $\sigma$ is given by $\sigma(\zeta_5) = \zeta_5^3$. The prime 2 is totally inert in the extension $\mathbb{Q}(\zeta_5)/\mathbb{Q}$ and therefore [16, Lemma 11.1] $\mathcal{D}_{mido}$ is a division algebra. 
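The arithmetic behind this choice of algebra can be checked in a few lines. The sketch below is only an illustration: it relies on the standard fact that a prime $q \neq p$ is inert in $\mathbb{Q}(\zeta_p)$ exactly when $q$ generates $(\mathbb{Z}/p\mathbb{Z})^*$, and it assumes that $\sigma$ maps $\zeta_5$ to $\zeta_5^3$, as the later expression for $\sigma(y'_i)$ indicates. The function and variable names are ours.

```python
def multiplicative_order(a, n):
    """Smallest k >= 1 with a**k == 1 (mod n); assumes gcd(a, n) == 1."""
    k, power = 1, a % n
    while power != 1:
        power = (power * a) % n
        k += 1
    return k

p = 5  # we work in the cyclotomic field Q(zeta_5), of degree p - 1 = 4

# The prime 2 is inert in Q(zeta_5) iff its order modulo 5 equals 4.
order_of_2 = multiplicative_order(2, p)
two_is_inert = (order_of_2 == p - 1)

# Assumed generator sigma: zeta -> zeta^3 (condition c3).
sigma_exponent = 3
sigma_generates_galois_group = (multiplicative_order(sigma_exponent, p) == p - 1)

# sigma^2 sends zeta to zeta^(3*3 mod 5) = zeta^4 = zeta^(-1): complex conjugation.
sigma_squared_exponent = pow(sigma_exponent, 2, p)
sigma_squared_is_conjugation = (sigma_squared_exponent == p - 1)
```

The same helper applies to other cyclotomic fields of prime conductor, for instance to confirm that 3 generates $(\mathbb{Z}/7\mathbb{Z})^*$.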

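Let $\mathcal{O}_E = \mathbb{Z}\omega_1 \oplus \mathbb{Z}\omega_2 \oplus \mathbb{Z}\omega_3 \oplus \mathbb{Z}\omega_4$ be the ring of algebraic integers of $E$. The left representation $\psi^*$ of $\mathcal{D}_{mido}$ now yields 

$$\begin{pmatrix} y_1 & \gamma\sigma(y_4) & \gamma y_3^* & \gamma\sigma(y_2)^* \\ y_2 & \sigma(y_1) & \gamma y_4^* & \gamma\sigma(y_3)^* \\ y_3 & \sigma(y_2) & y_1^* & \gamma\sigma(y_4)^* \\ y_4 & \sigma(y_3) & y_2^* & \sigma(y_1)^* \end{pmatrix}, \qquad (25)$$

where $y_i = y_i(g_{4i-3}, g_{4i-2}, g_{4i-1}, g_{4i}) = g_{4i-3}\omega_1 + g_{4i-2}\omega_2 + g_{4i-1}\omega_3 + g_{4i}\omega_4$ and $g_{4i-j} \in \mathbb{Q}$ for $i = 1, 2, 3, 4$, $j = 0, 1, 2, 3$. If we pick an order $\Lambda$ from $\mathcal{D}_{mido}$, then $\psi^*(\Lambda)$ is a 16-dimensional lattice code with the NVD property from Proposition 7.2. 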

We can prove that the discriminant of this algebra meets the bound of Proposition 6.5, but even if we choose a maximal order from this algebra there is no guarantee (because we have not fulfilled the conditions of Proposition 7.2 yet) that this small discriminant would result in a good normalized minimum determinant. 

This is because we now face, for the first time, the problem that the embedding $\psi^*$ from Section VII does not directly give us an embedding into $M_2(\mathbb{H})$, although Proposition 7.1 promises that such an embedding exists. Luckily, we can perform a series of simple manipulations starting from the left regular representation that will transform the code matrices into a correct form and at the same time will recover the connection between the discriminant of the algebra and the normalized minimum determinant of the lattice. 

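After swapping 

1) $y_2$ and $y_3$, 

2) the 2nd and the 3rd column, and 

3) the 2nd and the 3rd row, 

we get the matrix 

$$\begin{pmatrix} y_1 & \gamma y_2^* & \gamma\sigma(y_4) & \gamma\sigma(y_3)^* \\ y_2 & y_1^* & \sigma(y_3) & \gamma\sigma(y_4)^* \\ y_3 & \gamma y_4^* & \sigma(y_1) & \gamma\sigma(y_2)^* \\ y_4 & y_3^* & \sigma(y_2) & \sigma(y_1)^* \end{pmatrix}. \qquad (26)$$

Next we perform the following energy balancing transformation by distributing the effect of $|\gamma|$ more evenly. By denoting $r = |\gamma|^{1/4}$, we finally get a code consisting of matrices of the desired type: 

$$X_{FD}(y_1, y_2, y_3, y_4) = \begin{pmatrix} y_1 & -r^2y_2^* & -r^3\sigma(y_4) & -r\sigma(y_3)^* \\ r^2y_2 & y_1^* & r\sigma(y_3) & -r^3\sigma(y_4)^* \\ ry_3 & -r^3y_4^* & \sigma(y_1) & -r^2\sigma(y_2)^* \\ r^3y_4 & ry_3^* & r^2\sigma(y_2) & \sigma(y_1)^* \end{pmatrix}. \qquad (27)$$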

The minimum determinant of the code stays unchanged since the above transformation is actually just a conjugation by a real matrix $M$. Let us now suppose that we have a maximal order $\Lambda$ of the algebra $\mathcal{D}_{mido}$ (such an order can be found by using the computer algebra system Magma [29]). Now the new code obtained from this maximal order is $M\psi^*(\Lambda)M^{-1}$, and a direct calculation reveals that this code lattice meets the normalized minimum determinant bound $\delta(\phi(\Lambda)) = 0.068\ldots$ (cf. Propositions 7.2, 7.3, 6.5, and Example 6.1). 
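The effect of such a conjugation is easiest to see in the smallest case. The sketch below is a toy illustration in index 2 (not the 4x4 code itself): it assumes a hypothetical algebra with $u^2 = \gamma < 0$ and $\sigma$ equal to complex conjugation, uses $r = |\gamma|^{1/2}$ in place of $|\gamma|^{1/4}$, and picks arbitrary sample values for `x0` and `x1`. Conjugating the left regular representation by a real diagonal matrix turns it into a matrix of Alamouti type while leaving the determinant unchanged.

```python
import math

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def det2(A):
    """Determinant of a 2x2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

gamma = -8 / 9                   # negative non-norm element, as in the text
r = math.sqrt(abs(gamma))        # toy index-2 case: r = |gamma|^(1/2)
x0, x1 = 2 + 1j, -1 + 3j         # hypothetical sample coefficients

# Left regular representation of x0 + u*x1 with u^2 = gamma.
X = [[x0, gamma * x1.conjugate()],
     [x1, x0.conjugate()]]

# Energy balancing: conjugation by the real diagonal matrix M = diag(1, r).
M = [[1.0, 0.0], [0.0, r]]
M_inv = [[1.0, 0.0], [0.0, 1.0 / r]]
Y = matmul(matmul(M, X), M_inv)  # equals [[x0, -r*x1*], [r*x1, x0*]]
```

After the conjugation the off-diagonal entries carry the same weight $r$, which is the balancing effect described above.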

To make the code suitable for PAM modulation, we further describe a modified version of this code that will have an almost rectangular shaping. The ring of algebraic integers in $\mathbb{Q}(\zeta_5)$ also has a $\mathbb{Z}$-basis $\{1 - \zeta, \zeta - \zeta^2, \zeta^2 - \zeta^3, \zeta^3 - \zeta^4\}$, where we have abbreviated $\zeta_5 = \zeta$. The elements in the code matrix (27) now become, after further restricting the coefficients $g_i$ to $\mathbb{Z}$: 

$$y'_i = y'_i(g_{4i-3}, g_{4i-2}, g_{4i-1}, g_{4i}) = g_{4i-3}(1-\zeta) + g_{4i-2}(\zeta-\zeta^2) + g_{4i-1}(\zeta^2-\zeta^3) + g_{4i}(\zeta^3-\zeta^4)$$

and 

$$\sigma(y'_i) = g_{4i-3}(1-\zeta^3) + g_{4i-2}(\zeta^3-\zeta) + g_{4i-1}(\zeta-\zeta^4) + g_{4i}(\zeta^4-\zeta^2).$$

We get a set of matrices $X_{FD,A_4}(y'_1, y'_2, y'_3, y'_4)$ forming a 16-dimensional lattice code in $M_2(\mathbb{H})$. We note that the choice of $\gamma = -8/9$ prevents this order code from being a natural order. However, after multiplication by $9^4$, the resulting lattice code will be included in a natural order, thus inheriting the NVD property. The geometric structure of the code is relatively close to a Cartesian product of four $A_4$-lattices (see [30]), therefore we call it the A4 code. This code was also proposed for the DVB Consortium's Call for Technologies for DVB-NGH [31]. 

The variables $g_{4i-j}$ in each of the $y'_i$ range over a certain PAM set, so that the code encodes overall 16 independent PAM symbols. In other words, a PAM vector $(g_1, \ldots, g_{16})$ is mapped into a $(4 \times 4)$ matrix 

$$\sum_{i=1}^{16} g_iB_i,$$

where the basis matrices $B_i$ of the code are 

$$B_1 = X_{FD,A_4}(y'_1(1,0,0,0), 0, 0, 0),$$
$$B_2 = X_{FD,A_4}(y'_1(0,1,0,0), 0, 0, 0),$$
$$\vdots$$
$$B_{16} = X_{FD,A_4}(0, 0, 0, y'_4(0,0,0,1)).$$

A direct calculation shows that 

$$\Re\left(\mathrm{Tr}(HB_i(HB_j)^{\dagger})\right) = 0$$

for $1 \le i \le 4$ and $5 \le j \le 8$, where $H$ is a $(2 \times 4)$ channel matrix. This is exactly the design criterion of Subsection III-A described by the steps 1-2, yielding a complexity of $|S|^{12}$ for the code A4. 
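The criterion $\Re(\mathrm{Tr}(HB_i(HB_j)^{\dagger})) = 0$ can be illustrated on the smallest example of this kind. The sketch below does not use the A4 code; it assumes the classical $2 \times 2$ Alamouti code with its four real-coordinate basis matrices, and a randomly drawn channel matrix `H`. All names are ours. For every pair of distinct basis matrices the criterion evaluates to zero, whatever the channel.

```python
import random

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dagger(A):
    """Conjugate transpose."""
    return [[A[j][i].conjugate() for j in range(len(A))] for i in range(len(A[0]))]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

# Real-coordinate basis of the Alamouti code [[x1, -x2*], [x2, x1*]].
BASIS = [
    [[1, 0], [0, 1]],       # Re(x1)
    [[1j, 0], [0, -1j]],    # Im(x1)
    [[0, -1], [1, 0]],      # Re(x2)
    [[0, 1j], [1j, 0]],     # Im(x2)
]

def criterion(H, Bi, Bj):
    """Re Tr(H Bi (H Bj)^dagger); zero means the two symbols decouple."""
    return trace(matmul(matmul(H, Bi), dagger(matmul(H, Bj)))).real

random.seed(0)  # fixed seed so that the example is reproducible
H = [[complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(2)]
     for _ in range(2)]

table = [[criterion(H, Bi, Bj) for Bj in BASIS] for Bi in BASIS]
```

In the terminology of the text, each vanishing entry of `table` corresponds to a zero entry of the $R$-matrix, so the associated symbols can be decoded independently.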

We can perform yet another change of basis that will enable us to take advantage of the steps 3-4 described in Subsection III-A. The new basis 

$$\left\{1, \frac{\zeta+\zeta^{-1}}{2}, \frac{\zeta-\zeta^{-1}}{2}, \frac{\zeta^2-\zeta^{-2}}{4}\right\}$$

will result in a complexity $|S|^{10}$, reduced by as much as 37.5% from the full complexity $|S|^{16}$ of a general MIDO code. However, it is not an integral basis, hence the minimum determinant is small though still non-vanishing. 

The resulting lattice has almost cubic shaping, but, due 
to the coding gain loss, the performance is approximately 
1 dB worse than that of the A4 version. The promised 
complexity reduction is due to the fact that the first two 
basis elements are real, while the last two are purely 
imaginary. Hence the relations given by the steps 1-4 in 
III-A are all satisfied. 

Remark 8.1: To the best of our knowledge, there is 
no guarantee that an integral basis consisting of n/2 real 
and n/2 purely imaginary elements even exists. 

Remark 8.2: The matrix manipulations given in this 
section may also seem to have a somewhat ad hoc 
feeling. Yet we will see in Sections X and XI that 
this strategy can be used far more generally to give us 
embeddings to $M_k(\mathbb{H})$. 

Remark 8.3: We also simulated the maximal order 
code from this algebra achieving the discriminant bound 
and the A4 code under spherical shaping. Both codes 
had equally good performance, gaining almost 1 dB 
compared to the linearly dispersed A4. It seems that the 
A4 code did inherit the good performance of the optimal 
maximal order code. 

B. MIDO codes from a bigger center through puncturing 

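We now adopt a slightly different approach to the design problem of MIDO codes via puncturing of MIMO codes. We start from the matrix (14) 

$$\begin{pmatrix} x_0 & i\sigma(x_3) & i\sigma^2(x_2) & i\sigma^3(x_1) \\ x_1 & \sigma(x_0) & i\sigma^2(x_3) & i\sigma^3(x_2) \\ x_2 & \sigma(x_1) & \sigma^2(x_0) & i\sigma^3(x_3) \\ x_3 & \sigma(x_2) & \sigma^2(x_1) & \sigma^3(x_0) \end{pmatrix}$$

and puncture it in two different ways. 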

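Let us first repeat a remark made above, namely that $\mathbb{Q}(\zeta_5 + \zeta_5^{-1}) = \mathbb{Q}(\sqrt{5})$ is a subfield of $\mathbb{Q}(\zeta_5)$. As a first puncturing, we restrict ourselves to elements in $\mathbb{Q}(\sqrt{5})$ instead of $\mathbb{Q}(\zeta_5)$. Note that since $\sigma^2(\zeta_5) = \zeta_5^4$, we further have 

$$\sigma^2(\zeta_5 + \zeta_5^{-1}) = \zeta_5^4 + \zeta_5^{-4} = \zeta_5^{-1} + \zeta_5$$

and thus $\mathbb{Q}(\sqrt{5})$ is fixed by $\sigma^2$. This yields a codebook $\mathcal{C}_1$ consisting of codewords of the form 

$$X = \frac{1}{\sqrt{5}} \begin{pmatrix} x_0 & i\sigma(x_3) & ix_2 & i\sigma(x_1) \\ x_1 & \sigma(x_0) & ix_3 & i\sigma(x_2) \\ x_2 & \sigma(x_1) & x_0 & i\sigma(x_3) \\ x_3 & \sigma(x_2) & x_1 & \sigma(x_0) \end{pmatrix}. \qquad (28)$$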

It is now enough to notice that we are working in the same field extension as for the Golden code [25], meaning that we can use the same shaping technique. Denote: 

$$\theta = \frac{1+\sqrt{5}}{2}, \qquad \sigma(\theta) = \frac{1-\sqrt{5}}{2} = 1-\theta,$$
$$\alpha = 1+i-i\theta, \qquad \sigma(\alpha) = 1+i-i\sigma(\theta).$$

Every entry $x_j$ in the above matrix is now taking the form 

$$x_j = \alpha(a_j + b_j\theta), \quad j = 0, 1, 2, 3,$$

where $a_j, b_j \in \mathbb{Z}[i]$ are chosen to be QAM symbols. We thus indeed get a MIDO code carrying 8 complex QAM symbols, with unitary encoding matrix yielding the cubic shaping property. The factor $\frac{1}{\sqrt{5}}$ is used to normalize the minimum determinant to one. 

A straightforward calculation gives that the volume of the fundamental parallelotope of this code is $5^4 \cdot 2^8$. At the same time, the minimum determinant of the code is 1. If we now scale the code $\mathcal{C}_3$ with $\left(\frac{1}{5^4 \cdot 2^8}\right)^{1/16}$, the resulting code lattice $\tilde{\mathcal{C}}_3 = \left(\frac{1}{5^4 \cdot 2^8}\right)^{1/16} \cdot \mathcal{C}_3$ has a fundamental parallelotope of volume 1. We now see that the normalized minimum determinant of the lattice $\mathcal{C}_3$ is 

$$\left(\left(\frac{1}{5^4 \cdot 2^8}\right)^{1/16}\right)^{4} = \frac{1}{20}.$$
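The normalization can be reproduced with a few lines of arithmetic. The sketch below assumes only the numbers stated in the text (volume $5^4 \cdot 2^8$, minimum determinant 1, a 16-dimensional lattice of $4 \times 4$ matrices); the variable names are ours. Scaling a 16-dimensional lattice by a factor $c$ multiplies its volume by $c^{16}$ and the determinant of a $4 \times 4$ matrix by $c^4$.

```python
volume = 5**4 * 2**8      # volume of the fundamental parallelotope
dimension = 16            # real dimension of the lattice
matrix_size = 4           # codewords are 4 x 4 matrices
min_det = 1               # minimum determinant before scaling

# Scaling factor that normalizes the volume to 1.
scale = (1 / volume) ** (1 / dimension)

scaled_volume = scale**dimension * volume             # should be 1
normalized_min_det = scale**matrix_size * min_det     # should be 1/20

# Exact cross-check in integers: 5^4 * 2^8 = 20^4.
volume_is_fourth_power_of_20 = (volume == 20**4)
```

The integer identity $5^4 \cdot 2^8 = 20^4$ is what makes the normalized minimum determinant come out as exactly $1/20$.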



Comparing this to Proposition 6.2, we conclude that the normalized minimum determinant of the code $\mathcal{C}_3$ is very close to the optimum minimum determinant of orthogonally shaped MIDO codes. The good performance of this code once again suggests that it is favorable for the code performance at low SNRs to maintain the cubic shaping. 

Take again a codeword 

$$\begin{pmatrix} x_0 & i\sigma(x_3) & ix_2 & i\sigma(x_1) \\ x_1 & \sigma(x_0) & ix_3 & i\sigma(x_2) \\ x_2 & \sigma(x_1) & x_0 & i\sigma(x_3) \\ x_3 & \sigma(x_2) & x_1 & \sigma(x_0) \end{pmatrix}$$

and multiply both the 3rd and 4th column by $\zeta_8^{-1}$, where $\zeta_8 = e^{2i\pi/8}$ is a primitive 8th root of unity. Then multiply the 3rd and 4th row, this time by $\zeta_8$. Note that this of course brings the matrix entries out of the algebra we started with, but will do this without changing the determinant. We further note that we can use $\gamma = -i$ instead of $\gamma = i$, since $-i$ is not a norm. The proof of this fact is similar to that of the non-norm element $i$ (cf. IV-A), and follows from the same argument of the transitivity of the norm. We have to show that there cannot be an element with norm $-i$ over $\mathbb{Q}(i,\sqrt{5})/\mathbb{Q}(i)$. If there were an element $a$ with $N_{\mathbb{Q}(\sqrt{5},i)/\mathbb{Q}(i)}(a) = -i$, then $ia$ would have norm 

$$i^2 \cdot N_{\mathbb{Q}(\sqrt{5},i)/\mathbb{Q}(i)}(a) = i,$$

a contradiction. Again for the case of $N(a) = \gamma^2 = -1$ we refer the reader to [16, Section 8]. 

We now obtain the codebook $\mathcal{C}_3$ consisting of matrices 

$$\begin{pmatrix} x_0 & -i\sigma(x_3) & -\zeta_8x_2 & -\zeta_8\sigma(x_1) \\ x_1 & \sigma(x_0) & -\zeta_8x_3 & -\zeta_8\sigma(x_2) \\ \zeta_8x_2 & \zeta_8\sigma(x_1) & x_0 & -i\sigma(x_3) \\ \zeta_8x_3 & \zeta_8\sigma(x_2) & x_1 & \sigma(x_0) \end{pmatrix}. \qquad (29)$$

Let us denote by $c_1$, $c_2$, $c_3$ and $c_4$ the 4 columns of the above matrix. It can be easily seen that the above manipulations result in having columns 1 and 3, and 2 and 4 satisfying 

$$c_1^Tc_3 = 0, \qquad c_2^Tc_4 = 0,$$

without changing the shaping. This construction thus increases the "orthogonality-likeness" of the columns of the code without altering its other properties. Though this transformation does increase the number of zeroes in the $R$-matrix of the QR decomposition, it does not reduce the decoding complexity as defined. This is due to the fact that, albeit the above relations resemble the real inner product, the vectors $c_j$ actually consist of complex elements. 

We now propose another puncturing, which focuses this time on having orthonormal columns, in order to have provable fast decodability. Since $\mathbb{Q}(\zeta_5, i) = \mathbb{Q}(\zeta_{20})$, where $\zeta = \zeta_{20} = e^{2i\pi/20}$ is a primitive 20th root of unity, we can alternatively take as basis for $\mathbb{Q}(i, \zeta_5)$ the set $\{1, \zeta, \zeta^2, \zeta^3\}$. An element $x$ is then written as 

$$x = a + b\zeta + c\zeta^2 + d\zeta^3, \quad a, b, c, d \in \mathbb{Q}(i).$$

We perform the following puncturing and restriction of coefficients. Take $x_0, x_1$ of the form 

$$a + ib\zeta + c\zeta^2 + id\zeta^3, \quad a, b, c, d \in \mathbb{Z},$$

so that $\sigma^2(x_0) = x_0^*$, $\sigma^2(x_1) = x_1^*$. For $x_2$ and $x_3$, take instead 

$$a(1+i) + b(1-i)\zeta + c(1+i)\zeta^2 + d(1-i)\zeta^3, \quad a, b, c, d \in \mathbb{Z},$$

to get this time $\sigma^2(x_2) = -x_2^*$, $\sigma^2(x_3) = -x_3^*$. This results in a codebook $\mathcal{C}_2$ with codewords given by 

$$X = \begin{pmatrix} x_0 & i\sigma(x_3) & -x_2^* & i\sigma(x_1)^* \\ x_1 & \sigma(x_0) & -x_3^* & -\sigma(x_2)^* \\ x_2 & \sigma(x_1) & x_0^* & -\sigma(x_3)^* \\ x_3 & \sigma(x_2) & x_1^* & \sigma(x_0)^* \end{pmatrix}. \qquad (30)$$

An easy computation shows that the 1st and 3rd row, resp. the 2nd and 4th row, are orthonormal. By permuting the 2nd and 3rd rows and columns resp., we get 

$$X_{\mathcal{C}_2}(x_0, x_1, x_2, x_3) = \begin{pmatrix} x_0 & -x_2^* & i\sigma(x_3) & i\sigma(x_1)^* \\ x_2 & x_0^* & \sigma(x_1) & -\sigma(x_3)^* \\ x_1 & -x_3^* & \sigma(x_0) & -\sigma(x_2)^* \\ x_3 & x_1^* & \sigma(x_2) & \sigma(x_0)^* \end{pmatrix}, \qquad (31)$$

which clearly exhibits the Alamouti block structure of the code. 

As previously for the $A_4$-code, a PAM vector $(g_1, \ldots, g_{16})$ is mapped into a $(4 \times 4)$ matrix 

$$\sum_{i=1}^{16} g_iB_i,$$

where the basis matrices $B_i$ are 

$$B_1 = X_{\mathcal{C}_2}(x_0(1,0,0,0), 0, 0, 0),$$
$$B_2 = X_{\mathcal{C}_2}(x_0(0,1,0,0), 0, 0, 0),$$
$$\vdots$$
$$B_{16} = X_{\mathcal{C}_2}(0, 0, 0, x_3(0,0,0,1)).$$

Again a direct calculation gives 

$$\Re\left(\mathrm{Tr}(HB_i(HB_j)^{\dagger})\right) = 0$$

for $1 \le i \le 4$ and $5 \le j \le 8$ and a complexity of $|S|^{12}$. 

C. The Srinath-Rajan (SR) code 

So far, the best performing fast-decodable 4x2 code 
has been the code based on stacked CIODs proposed 
in [5]. The real and imaginary parts of the encoded 
symbols are separated in a careful way, so that when 
a rotated 4- or 16-QAM alphabet is used, the code has high coding gain. It is moreover conjectured that the code has the NVD property, but this has not been proved. Before rotating the constellation, the code is equivalent to transmitting four independent Alamouti blocks $A, B, C, D$: 

$$\begin{pmatrix} A & \zeta_8B \\ \zeta_8C & D \end{pmatrix},$$

where a primitive 8th root of unity $\zeta_8$ has been added in order to maximize the coding gain of the rotated code. 
Because the blocks are independent prior to rotation, 
the unrotated code does not have full diversity. For this 
reason, getting a proof for the possible NVD by using the 
theory developed in this paper does not seem possible. 

If we ignore the constant $\zeta_8$, the code is exactly of the same form as the codes proposed in this paper (except possibly for the NVD), as clearly 

$$\begin{pmatrix} A & B \\ C & D \end{pmatrix} \in M_2(\mathbb{H}).$$

Adding the constant does not affect fast decodability, but helps to maximize the coding gain. 

We have not tried whether it is possible to improve 
the coding gain of the codes proposed in this paper by 
using a suitable rotation. This may be seen as a reason 
for the small performance loss of the proposed codes 
compared to the rotated SR code. We did however try 
another type of optimization, namely using a spherical 
constellation instead of linearly dispersed constellation 
(cf. VI-A). The spherically shaped fast-decodable code 
outperforms the SR code (see Section IX below) by a 
fraction of a dB. 

IX. Simulation results of MIDO codes 

In Figure 1, we have plotted the block error rates 
of different MIDO codes at the rate 4 bpcu. All of 
the codes use the 2-PAM or 4-QAM alphabet, except for the spherically shaped A4 code referred to as NC (FD, A4, spher.). This code is constructed by using a 6-PAM alphabet and then choosing the codewords with the smallest Frobenius norms, resulting in a codebook with $2^{16}$ codewords. 
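The idea of spherical shaping can be sketched on a toy lattice. The example below is an illustration only: the 2-dimensional basis `toy_basis`, the codebook size 16, and all function names are hypothetical and unrelated to the actual 16-dimensional A4 code. Candidate points are generated from a larger PAM alphabet, and only those of smallest squared norm are kept; the result is compared with the linearly dispersed codebook that uses a smaller PAM alphabet and has the same number of codewords.

```python
from itertools import product

def pam(size):
    """Symmetric PAM alphabet of even size, e.g. size 6 -> [-5, -3, -1, 1, 3, 5]."""
    return list(range(-(size - 1), size, 2))

def spherical_codebook(basis, alphabet, num_codewords):
    """Return the num_codewords lattice points of smallest squared norm,
    as (squared_norm, coefficient_tuple) pairs."""
    points = []
    for coeffs in product(alphabet, repeat=len(basis)):
        point = [sum(c * b[k] for c, b in zip(coeffs, basis))
                 for k in range(len(basis[0]))]
        norm_sq = sum(abs(v) ** 2 for v in point)
        points.append((norm_sq, coeffs))
    points.sort(key=lambda item: item[0])
    return points[:num_codewords]

def average_energy(codebook):
    return sum(norm_sq for norm_sq, _ in codebook) / len(codebook)

# Hypothetical skewed 2-dimensional lattice basis.
toy_basis = [(1.0, 0.0), (0.5, 1.5)]

# Spherical shaping: 16 shortest points among the 36 candidates from 6-PAM.
spherical = spherical_codebook(toy_basis, pam(6), num_codewords=16)

# Linear dispersion: all 16 points obtained from 4-PAM coefficients.
linear = spherical_codebook(toy_basis, pam(4), num_codewords=16)
```

Since every 4-PAM point is also a 6-PAM candidate, the spherically shaped codebook can never have a larger average energy than the linearly dispersed one of the same size.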

We can see that the punctured code $\mathcal{C}_2$ (NC (FD, punct.)) does not perform too well due to its small (though non-vanishing) coding gain. The other new codes, for their part, perform more or less equally to the Biglieri-Hong-Viterbo (BHV) code. The A4 code (NC (FD, A4)) is slightly beaten by the BHV code at low-moderate SNRs, but will eventually outperform it starting from 20 dB, thanks to its full diversity. The shaped code (NC (shaped)), which is not fast-decodable, outperforms the BHV code starting from 16 dB. The Srinath-Rajan (SR) code with a rotated 4-QAM constellation beats the BHV code by a fraction of a dB. The spherically shaped A4 outperforms the BHV and SR codes by roughly 0.5 dB, and performs slightly better than the best previously known MIDO code IA-MAX [10]. The code IA-MAX is constructed from a certain maximal order, and has higher decoding complexity. It is added here for the sake of completeness in comparison. 

[Figure 1: MIDO block error rates at 4 bpcu versus SNR (dB). Curves: NC (FD, punct.), NC (FD, A4), NC (shaped), BHV (FD, shaped), SR (FD, shaped), NC (FD, A4, spher.), IA-MAX.] 

Fig. 1. Comparison among different MIDO codes at rate 4 bpcu. 

Let us point out that we have not optimized any of the 
proposed codes by e.g. rotating the constellation. Just out 
of interest, we simulated the unrotated SR code, and the 
performance got somewhat worse than that of the A4 
code. Hence, we also expect some improvement in the 
performance of our codes, when an optimal rotation is 
used. 

We can also use the maximal order of the A4 code 
algebra, which will result in similar performance as 
the IA-MAX and spherically shaped A4 code. While 
the maximal order codes are not fast-decodable, the 
spherically shaped A4 code still uses the same linear 
dispersion matrices and hence admits fast decodability. 
However, an extra step is required to check that the 
decoded word really belongs to the codebook. For a 
detailed description of the required changes in a sphere 
decoder, see [32]. As a conclusion, sticking to linear 
dispersion and natural orders causes a penalty of about 
0.5 dB in the BLER performance. On the other hand, 
it seems that the requirement of fast-decodability itself 
does not cause any performance loss. This is hardly 
a surprise, as the proposed constructions are nothing 
but orders of cyclic division algebras, which have been 
shown to have excellent performance [16], [33], [10]. 

X. Fast-decodable codes for the 6x3 and 6x2 channels 

Let us now extend our code constructions to six 
transmit antennas. While this paper mainly deals with 
MIDO codes, i.e., codes for two receivers, here we also 
consider the case of three receivers. The reason for this is that the embedding (18) 

$$\psi : \mathcal{A} \to \mathrm{diag}(M_{n/2}(\mathbb{H})^m)$$

into a matrix ring of the Hamiltonian quaternions naturally yields codes with dimension rate $R_1 = n$, which is also the number of Tx antennas. Thus, for six transmitters we have $R_1 = 6$, which is ideal for reception with three antennas. From this, we can construct a code suitable for two receivers ($R_1 = 4$) by puncturing. The so-called smart puncturing [34], [10] will be applied in order to further reduce the decoding complexity, while maintaining a low peak-to-average power ratio (PAPR). 

A. Construction for the 6 x 3 channel 

We build our $(6 \times 6)$ code matrix analogously to the $(4 \times 4)$ case (cf. Subsection VIII-A). To this end, we consider the index-six cyclic algebra 

$$\mathcal{A} = (\mathbb{Q}(\zeta_7)/\mathbb{Q}, \sigma : \zeta_7 \mapsto \zeta_7^3, -3/4)$$

built upon the 7th cyclotomic field. Since 3 is inert ((3 mod 7) generates the whole group $\mathbb{Z}_7^*$), the element $\gamma = -3/4$ is a non-norm element and $\mathcal{A}$ is a division algebra. As the center $\mathbb{Q}$ is totally real and only has one infinite place which is ramified, we have an embedding $\mathcal{A} \hookrightarrow M_3(\mathbb{H})$. 

Let us now build the embedded code matrix more explicitly. We start by noting that 

$$\sigma^3(x) = x^*$$

for all $x \in \mathbb{Q}(\zeta_7)$, and hence, taking into account that $\sigma(x^*) = \sigma(x)^*$, we get 

$$\sigma^4(x) = \sigma(x)^*, \qquad \sigma^5(x) = \sigma^2(x)^*.$$
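These relations are easy to confirm numerically. The sketch below is an illustration with our own helper names; the coefficient vector `coeffs` is an arbitrary hypothetical element of $\mathbb{Q}(\zeta_7)$, and $\sigma$ is taken to be the automorphism $\zeta_7 \mapsto \zeta_7^3$ from the definition of the algebra. Applying $\sigma^j$ amounts to replacing $\zeta_7$ by $\zeta_7^{3^j}$.

```python
import cmath

N = 7
zeta = cmath.exp(2j * cmath.pi / N)   # primitive 7th root of unity

def evaluate(coeffs, exponent=1):
    """Complex value of sum_k coeffs[k] * (zeta^exponent)^k."""
    return sum(c * zeta ** ((k * exponent) % N) for k, c in enumerate(coeffs))

def sigma_power(coeffs, j):
    """Apply sigma^j, where sigma sends zeta to zeta^3."""
    return evaluate(coeffs, pow(3, j, N))

# Hypothetical element x = 2 - zeta + 3*zeta^3 + zeta^4 - 4*zeta^5.
coeffs = [2, -1, 0, 3, 1, -4]
x = evaluate(coeffs)

# Exponents 3^j mod 7 for j = 0..5: sigma generates the whole Galois group.
exponents = [pow(3, j, N) for j in range(6)]

sigma3_is_conjugation = abs(sigma_power(coeffs, 3) - x.conjugate()) < 1e-9
sigma4_matches = abs(sigma_power(coeffs, 4) - sigma_power(coeffs, 1).conjugate()) < 1e-9
sigma5_matches = abs(sigma_power(coeffs, 5) - sigma_power(coeffs, 2).conjugate()) < 1e-9
```

The key point is that $3^3 \equiv 6 \equiv -1 \pmod 7$, so $\sigma^3$ sends $\zeta_7$ to $\zeta_7^{-1}$, which is its complex conjugate.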

We can again start with the left regular representation and perform some simple manipulation on the resulting matrix: first, we swap the 2nd and the 4th row, and the 3rd and the 5th row. After this, we swap the 3rd and the 4th row. Next, we do the same for the columns. Let us denote this intermediate form by $X'$. Then we balance the effect of $\gamma$ to get a more unified energy distribution among the antennas. This can be done by conjugating the matrix $X'$ by the matrix 

$$P = \mathrm{diag}(r, r^2, r, r^2, r, r^2),$$

where $r = \sqrt{|\gamma|}$. Finally, we do the exchange $x_3 \leftrightarrow x_1$ and $x_4 \leftrightarrow x_2$, followed by $x_2 \leftrightarrow x_3$. The final form of the code matrix now becomes 

$$X = PX'P^{-1} = \begin{pmatrix} A & B & C \end{pmatrix}, \qquad (32)$$

where each of 

$$A = \begin{pmatrix} x_0 & -rx_1^* \\ rx_1 & x_0^* \\ x_2 & -rx_3^* \\ rx_3 & x_2^* \\ x_4 & -rx_5^* \\ rx_5 & x_4^* \end{pmatrix}, \qquad (33)$$

$$B = \begin{pmatrix} -r^2\sigma(x_5) & -r\sigma(x_4)^* \\ r\sigma(x_4) & -r^2\sigma(x_5)^* \\ \sigma(x_0) & -r\sigma(x_1)^* \\ r\sigma(x_1) & \sigma(x_0)^* \\ \sigma(x_2) & -r\sigma(x_3)^* \\ r\sigma(x_3) & \sigma(x_2)^* \end{pmatrix}, \qquad (34)$$

and 

$$C = \begin{pmatrix} -r^2\sigma^2(x_3) & -r\sigma^2(x_2)^* \\ r\sigma^2(x_2) & -r^2\sigma^2(x_3)^* \\ -r^2\sigma^2(x_4) & -r\sigma^2(x_5)^* \\ r\sigma^2(x_5) & -r^2\sigma^2(x_4)^* \\ \sigma^2(x_0) & -r\sigma^2(x_1)^* \\ r\sigma^2(x_1) & \sigma^2(x_0)^* \end{pmatrix} \qquad (35)$$

consists of three Alamouti blocks. 

The encoding can be performed similarly as in the 4x2 case. Let us denote the 36 basis matrices by 

$$B_1 = B_1(x_0(1,0,0,0,0,0), 0, 0, 0, 0, 0),$$
$$B_2 = B_2(x_0(0,1,0,0,0,0), 0, 0, 0, 0, 0),$$
$$\vdots$$
$$B_{36} = B_{36}(0, 0, 0, 0, 0, x_5(0,0,0,0,0,1)).$$

We then form a finite code by setting 

$$\mathcal{C}_{6 \times 3} = \left\{ \sum_{i=1}^{36} g_iB_i \;\middle|\; g_i \in \mathcal{G} \right\},$$

where $\mathcal{G} \subset \mathbb{Z}$ is, for instance, a $Q$-PAM alphabet. 
B. Decoding 

Let us now consider the sphere decoding process as described in III for the code (32). Following the above notation, we notice that the code lattice has six basis matrices $B_1, \ldots, B_6$ of the form 

$$\mathrm{diag}\left(x_0, x_0^*, \sigma(x_0), \sigma(x_0)^*, \sigma^2(x_0), \sigma^2(x_0)^*\right)$$

and six basis matrices $B_7, \ldots, B_{12}$ of the form 

$$\begin{pmatrix} A' & & \\ & B' & \\ & & C' \end{pmatrix},$$

where 

$$A' = \begin{pmatrix} 0 & -rx_1^* \\ rx_1 & 0 \end{pmatrix}, \qquad B' = \begin{pmatrix} 0 & -r\sigma(x_1)^* \\ r\sigma(x_1) & 0 \end{pmatrix},$$

and 

$$C' = \begin{pmatrix} 0 & -r\sigma^2(x_1)^* \\ r\sigma^2(x_1) & 0 \end{pmatrix}.$$
A straightforward calculation shows that 

$$\Re\left(\mathrm{Tr}(HB_i(HB_j)^{\dagger})\right) = 0$$

for $1 \le i \le 6$, $7 \le j \le 12$ and any channel matrix $H$. Hence, the $(36 \times 36)$ $R$-matrix of the QR decomposition has a $(6 \times 6)$ zero block in the corresponding position, and the $(12 \times 12)$ upper left corner of $R$ looks like 

$$\begin{pmatrix} R^{1,1} & 0 \\ 0 & R^{2,2} \end{pmatrix},$$

where the blocks $R^{i,i}$ are $(6 \times 6)$ matrices. From this we see that the symbols $g_1, \ldots, g_6$ can be decoded independently of the symbols $g_7, \ldots, g_{12}$, resulting in complexity $2|S|^{30}$ instead of the full complexity $|S|^{36}$. 
Further reductions are possible by a change of basis, similarly as in the 4x2 case. By forming the basis of elements half of which are real and the other half purely imaginary (cf. VIII-A), we get more zeros in the $R$ matrix. In that case we again have, for any channel matrix $H$, that 

$$\Re\left(\mathrm{Tr}(HB_i(HB_j)^{\dagger})\right) = 0$$

for $1 \le i \le 6$, $7 \le j \le 12$, but further also get 

$$\Re\left(\mathrm{Tr}(HB_i(HB_j)^{\dagger})\right) = 0$$

for $1 \le i \le 3$, $4 \le j \le 6$ and $7 \le i \le 9$, $10 \le j \le 12$, resulting in complexity $4|S|^{27}$. 



C. Construction for the 6 x 2 channel by puncturing 

In order to construct a 6 x 2 MIDO code, we will 
next consider a punctured version of the above code. 
The puncturing affects the shape of the code lattice, 
so a different puncturing will give a different lattice and 
hence also different performance. One obvious option 
is to keep an eye on the Gram matrix of the lattice: 
the closer it is to a (scaled) identity matrix, the 
better the shape. A smart puncturing may also aid the 
decoding process, namely we may puncture the basis 



matrices that cause nonorthogonality. On the other hand, 
it is not a good idea to puncture all six basis matrices 
corresponding to one of the elements Xi in (32), because 
this will cause zeros in the encoding matrix and hence 
increase the PAPR. 

Here we provide just one possible puncturing, to give the reader an idea as to how one may go about it. Let us denote the basis matrices as in the previous section by $B_1, \ldots, B_{36}$. We puncture the following basis matrices 

in $x_2$: $B_{13}, B_{14}, B_{15}$, 

in $x_3$: $B_{19}, B_{20}, B_{21}$, 

in $x_4$: $B_{25}, B_{26}, B_{27}$, 

in $x_5$: $B_{31}, B_{32}, B_{33}$. 

The resulting code will still have the same orthogonality relations as the original code, but will only have 24 basis elements giving us decoding complexity $4|S|^{15}$. 

XI. Further generalizations via conjugations of the left-regular representation 

As already pointed out, we can always embed a division algebra into a matrix ring of the Hamiltonian quaternions, provided that the center is totally real and all of its infinite places ramify. For all such division algebras, we have that $\sigma^{n_t/2}(x) = x^*$, $\sigma^{j+n_t/2}(x) = \sigma^j(x)^*$ and $\gamma < 0$. The embedding 

$$\psi : \mathcal{A} \to \mathrm{diag}(M_{n/2}(\mathbb{H})^m),$$

however, will only give us the existence of a fast-decodable code with dimension rate $n = n_t$. 

In what follows, we are going to show how to overcome the problem of the implicit nature of the map $\psi$. Once we have constructed a CDA $\mathcal{A} = (E/\mathbb{Q}, \sigma, \gamma)$ of the required form, the explicit map $\psi : \mathcal{A} \to M_{n_t/2}(\mathbb{H})$ is given as follows. 

Proposition 11.1: Let $X$ denote the left regular representation matrix of an element $a = x_0 + ux_1 + \cdots + u^{n_t-1}x_{n_t-1} \in \mathcal{A}$. Then 

$$\psi(X) = BPX(BP)^{-1} \in M_{n_t/2}(\mathbb{H}),$$

where the elements $P(i,j)$, $1 \le i,j \le n_t$, of the permutation matrix $P$ are 

$$P(i,j) = \begin{cases} 1, & \text{if } 2 \nmid i \text{ and } j = \frac{i+1}{2}, \\ 1, & \text{if } 2 \mid i \text{ and } j = \frac{i+n_t}{2}, \\ 0, & \text{otherwise,} \end{cases}$$

and 

$$B = \mathrm{diag}\left(\sqrt{|\gamma|}, |\gamma|, \ldots, \sqrt{|\gamma|}, |\gamma|\right)$$

is the energy balance matrix. 

Proof: Let us first consider the columns of $X$, and denote $X = (1, \sigma, \ldots, \sigma^{n_t-1})$ to represent the fact that the first column is mapped by the identity element, the second is mapped by $\sigma$, etc. In order to get the required Alamouti block form, we need to reorder the columns as 

$$(1, \sigma^{n_t/2}, \sigma, \sigma^{1+n_t/2}, \sigma^2, \sigma^{2+n_t/2}, \ldots, \sigma^{n_t/2-1}, \sigma^{n_t-1}),$$

so that $\sigma^j$ is followed by its conjugate for all $j$. This is done exactly by post-multiplying $X$ by $P^{-1}$. 

Next we have to rearrange the rows. Notice first that, by ignoring $\gamma$, the rows of $XP^{-1}$ look like 

$$\left(\begin{array}{ccccc} a_1 & b_1^* & \cdots & a_{n_t/2} & b_{n_t/2}^* \\ c_1 & d_1^* & \cdots & c_{n_t/2} & d_{n_t/2}^* \\ \vdots & \vdots & & \vdots & \vdots \\ s_1 & t_1^* & \cdots & s_{n_t/2} & t_{n_t/2}^* \\ \hline b_1 & a_1^* & \cdots & b_{n_t/2} & a_{n_t/2}^* \\ d_1 & c_1^* & \cdots & d_{n_t/2} & c_{n_t/2}^* \\ \vdots & \vdots & & \vdots & \vdots \\ t_1 & s_1^* & \cdots & t_{n_t/2} & s_{n_t/2}^* \end{array}\right),$$

where the horizontal line divides the matrix in two parts each having $n_t/2$ rows. We easily see that the Alamouti block form can be achieved by pairing the rows as 

$$(1, n_t/2 + 1), (2, n_t/2 + 2), \ldots, (n_t/2, n_t).$$

This is done by pre-multiplying $XP^{-1}$ by $P$, i.e., we conjugate $X$ by $P$. As the last step, we should take care of the effect of $\gamma$. By conjugating $PXP^{-1}$ further by $B = \mathrm{diag}(\sqrt{|\gamma|}, |\gamma|, \ldots, \sqrt{|\gamma|}, |\gamma|)$, the elements $\gamma$ will appear in each $(2 \times 2)$ block of the matrix as follows: 

$$\begin{pmatrix} (\pm)\sqrt{|\gamma|} & (\pm)|\gamma| \\ (\pm)|\gamma| & (\pm)\sqrt{|\gamma|} \end{pmatrix}.$$

In addition, the plus and minus signs are automatically rearranged by this conjugation so that the resulting matrix consists of Alamouti blocks. ■ 
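The permutation in Proposition 11.1 can be made concrete with a short sketch. It is an illustration under the assumption that $P$ has its nonzero entries at $j = (i+1)/2$ for odd $i$ and at $j = (i+n_t)/2$ for even $i$; the function names are ours. Listing the column index selected by each row shows the order in which the automorphisms $\sigma^0, \ldots, \sigma^{n_t-1}$ appear after the conjugation.

```python
def permutation_matrix(n_t):
    """Permutation matrix P with P(i, j) = 1 for j = (i+1)/2 (i odd)
    and j = (i+n_t)/2 (i even), using 1-based indices i, j."""
    P = [[0] * n_t for _ in range(n_t)]
    for i in range(1, n_t + 1):
        j = (i + 1) // 2 if i % 2 == 1 else (i + n_t) // 2
        P[i - 1][j - 1] = 1
    return P

def new_order(n_t):
    """1-based index of the original row/column moved to each new position."""
    return [row.index(1) + 1 for row in permutation_matrix(n_t)]

def sigma_exponents(n_t):
    """Exponent k of sigma^k attached to each column after the reordering."""
    return [j - 1 for j in new_order(n_t)]

order_4 = new_order(4)   # swaps the 2nd and 3rd rows/columns
order_6 = new_order(6)
pairs_6 = sigma_exponents(6)
```

For $n_t = 4$ this reproduces the swap of the 2nd and 3rd rows and columns used for the 4x2 code, and for $n_t = 6$ it reproduces the sequence of swaps used for the 6x3 code. In `pairs_6` every automorphism $\sigma^k$ is immediately followed by $\sigma^{k+n_t/2}$, i.e. by its complex conjugate.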
Remark 11.1: After Proposition 11.1, we can alge- 
braically optimize the normalized minimum determinant. 
Namely, the resulting parallelotope will be exactly that 
given by Proposition 7.3. Notice that this was not the 
case before the conjugation, for while the conjugation 
does not affect the non-normalized minimum determi- 
nant, it does affect the measure of the fundamental 
parallelotope and hence the normalized minimum deter- 
minant! 

Now that we have an explicit form of the mapping $\psi$, the fast-decodability property can be seen as follows: with $\mathbb{Q}$ as the center ($m = 1$), the $R$-matrix of the QR decomposition of the matrix $B$ (cf. III) will consist of $(n \times n)$ blocks $R^{i,j}$, $1 \le i,j \le n$, where 

$$R^{1,2} = R^{3,4} = \cdots = R^{n-1,n} = 0 \qquad (36)$$

and the diagonal blocks $R^{i,i}$, $1 \le i \le n$, are block-diagonal: 

$$R^{i,i} = \begin{pmatrix} \ast & 0 \\ 0 & \ast \end{pmatrix}_{n \times n}. \qquad (37)$$

The zero blocks (36) result from the Alamouti block structure and offer us a reduction of $n$ real dimensions. The diagonal block structure (37) is due to the fact that when we construct the algebra upon a complex multiplication field, we can always choose a basis in which half of the elements are real and the other half purely imaginary. This, for its part, provides us with further reduction by $n/2$ dimensions. Hence, the decoding complexity will be of order 

where the factor $n^2$ is the exhaustive search complexity. 

By puncturing, we obtain fast-decodable codes suitable for any number of receivers. The complexity of the punctured code is at most 

where $R_1 \le n_t$ is the dimension rate. For $n_r = 2$, we get a complexity reduction of $\frac{4n_t - \frac{5}{2}n_t}{4n_t} = 37.5\%$ as promised. However, this may require a non-integral basis, and hence cause performance loss compared to an integral basis. With an integral basis, we get a reduction of $\frac{4n_t - 3n_t}{4n_t} = 25\%$ while guaranteeing a high coding gain. 

In Table II we have summarized the complexities for $n_t = 4, 6, 8$ and $2 \le n_r \le \frac{n_t}{2}$ as an example. 

TABLE II 

Complexities of the proposed fast-decodable codes. 

n_t x n_r    R_1    n_t R_1 - 3n_t/2    Comp. reduction / n_t R_1 
4x2          4      10                  37.50% 
6x3          6      27                  25.00% 
6x2          4      15                  37.50% 
8x4          8      52                  18.75% 
8x3          6      36                  25.00% 
8x2          4      20                  37.50% 
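The entries of Table II can be regenerated from the two reductions described above. The sketch below assumes, as our reading of the text, that the exponent of $|S|$ equals $n_t R_1$ minus $n_t$ (Alamouti block structure) minus $n_t/2$ (real and purely imaginary basis elements), and that $R_1 = 2n_r$; the function names are ours.

```python
from fractions import Fraction

def complexity_exponent(n_t, R1):
    """Exponent of |S| after removing n_t + n_t/2 real dimensions
    from the exhaustive-search exponent n_t * R1 (n_t assumed even)."""
    return n_t * R1 - n_t - n_t // 2

def reduction(n_t, R1):
    """Relative reduction of the exponent, as an exact fraction."""
    full = n_t * R1
    return Fraction(full - complexity_exponent(n_t, R1), full)

# (n_t, n_r) pairs of Table II; the dimension rate is taken as R1 = 2 * n_r.
channels = [(4, 2), (6, 3), (6, 2), (8, 4), (8, 3), (8, 2)]

table = [(n_t, n_r, 2 * n_r,
          complexity_exponent(n_t, 2 * n_r),
          float(reduction(n_t, 2 * n_r)) * 100)
         for n_t, n_r in channels]

for n_t, n_r, R1, exponent, percent in table:
    print(f"{n_t}x{n_r}  R1={R1}  exponent={exponent}  reduction={percent:.2f}%")
```

Under this assumption the reduction equals $3/(2R_1)$ and therefore depends only on the dimension rate, which is why every two-receiver configuration in the table shows the same 37.5%.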



XII. Conclusions 

In this paper, fast-decodable asymmetric lattice space-time
codes were studied, proposing one possible generalization
of the Alamouti code and the quasi-orthogonal
codes to any even number of transmit antennas $n_t$ and for
any dimension rate $R_1 \le n_t$. The codes allow linear ML
processing with e.g. a sphere decoder for any number
of receivers $\ge R_1/2$, but with lower dimensionality
(fewer variables per linear equation). It was explicitly
shown how such novel constructions follow from general
algebraic principles by embedding a division algebra into
a matrix ring $M_k(\mathbb{H})$ of the Hamiltonian quaternions.
All this is in strong contrast to the previous ad hoc
constructions of fast-decodable codes that have been
specific to a certain number of antennas and lacking an
obvious generalization. The proposed codes furthermore
enjoy the NVD property, a property that no other
fast-decodable MIDO code found in the literature has been
proved to have.

We mainly considered the 4 x 2 MIDO case suitable
for DVB-NGH, but also provided constructions for the
6 x 2 and 6 x 3 cases. The explicit embeddings obtained
in these situations were shown to be fully generalizable
to any even number of Tx antennas. Simulations were
presented to show that the performance of the proposed
codes is comparable to the best known MIDO codes.
The achieved complexity reduction of up to 37.5% is also
among the best known for the MIDO channel.

In addition, a complete solution to the discriminant
minimization problem for division algebras with arbitrary
centers was given. As an application, a normalized
minimum determinant bound for code lattices in $M_k(\mathbb{H})$
was derived from the algebraic results.
Appendix

In this Appendix we are going to present some basic 
results from the theory of central simple algebras and 
in particular from the theory of Hasse-invariants. These 
results are needed only in Section VI. 

For a quick introduction we refer the reader to [16] 
and [35] where similar optimization has been done. 

Let us consider a $K$-central division algebra $\mathcal{A}$ of index
$n$. Then attached to each pair $(\mathcal{A}, P)$, where $P$ is a prime
of $K$, is a non-negative rational number $h_P = a/m_P$, the
so-called Hasse invariant of $\mathcal{A}$ at $P$. The Hasse invariants
of $\mathcal{A}$ fulfill the following. When $P$ is a prime ideal of
$K$, then

$$h_P = \frac{a}{m_P}, \quad 0 \le a < m_P \le n, \quad (a, m_P) = 1,$$

when $P$ is infinite and real, then

$$h_P = 1/2 \ \text{or} \ h_P = 0,$$

and when $P$ is infinite and complex, then

$$h_P = 0.$$

The number $m_P$ is called the local index at the prime $P$ (see
Section V-A). We say that the algebra $\mathcal{A}$ is ramified at
the prime $P$ if $h_P \ne 0$. The Hasse invariants define the
algebraic structure of a division algebra and in particular
the discriminant of the algebra.

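Proposition 12.1: Assume that $P_1, \ldots, P_s$ are a set of
finite prime ideals of $\mathcal{O}_K$ and $P_{s+1}, \ldots, P_n$ are a set of
real primes.

Assume further that a sequence of rational numbers

$$\frac{a_1}{m_{P_1}}, \ldots, \frac{a_s}{m_{P_s}}, \frac{a_{s+1}}{m_{P_{s+1}}}, \ldots, \frac{a_n}{m_{P_n}},$$

subject to the restriction that when $i > s$, $a_i/m_{P_i} = 1/2$,
satisfies

$$\sum_{i=1}^{n} \frac{a_i}{m_{P_i}} \equiv 0 \pmod{1}, \quad 1 \le a_i < m_{P_i}, \quad \text{and} \quad (a_i, m_{P_i}) = 1.$$

Then there exists a $K$-central division algebra $\mathcal{A}$ that
has local indices $m_{P_i}$ and the least common multiple
(LCM) of the numbers $\{m_{P_i}\}$ as an index.

If $\Lambda$ is a maximal $\mathcal{O}_K$-order in $\mathcal{A}$, then the
discriminant of $\Lambda$ is

$$d(\Lambda/\mathcal{O}_K) = \prod_{i=1}^{s} P_i^{(m_{P_i}-1)\,[\mathcal{A}:K]/m_{P_i}}.$$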
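
The conditions of Proposition 12.1 are easy to check mechanically. The following is a minimal sketch, assuming the standard exponent $(m_P - 1)[\mathcal{A}:K]/m_P$ for the contribution of a finite prime $P$ to the discriminant of a maximal order; the function names `local_index` and `analyse` are hypothetical.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def local_index(h: Fraction) -> int:
    """Denominator of h in lowest terms, i.e. m_P with (a, m_P) = 1."""
    return h.denominator


def analyse(finite, real):
    """Return (sum is an integer, index, discriminant exponents at finite primes)."""
    invariants = list(finite) + list(real)
    total = sum(invariants, Fraction(0))
    # The index is the least common multiple of the local indices.
    index = reduce(lambda a, b: a * b // gcd(a, b),
                   (local_index(h) for h in invariants), 1)
    degree = index * index  # [A : K] = index^2
    # Assumed exponent formula: (m_P - 1) * [A : K] / m_P for each finite prime.
    exponents = [(local_index(h) - 1) * degree // local_index(h) for h in finite]
    return total.denominator == 1, index, exponents


# Two finite primes with invariants 1/3 and 2/3: index 3, exponents n(n-1) = 6.
print(analyse([Fraction(1, 3), Fraction(2, 3)], []))
```

For two finite primes of local index $n$ the computed exponents equal $n(n-1)$, in agreement with the bound of Theorem 12.2.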

We have the following two general results. 

Theorem 12.2 ([16]): Let us suppose that we have a
number field $K$ and an integer $n$, where $4 \mid n$ or $2 \nmid n$. If
$P_1 < P_2$ is a pair of smallest primes in $\mathcal{O}_K$, then there
exists a $K$-central division algebra of index $n$ having a
maximal order with the $\mathcal{O}_K$-discriminant

$$(P_1 P_2)^{n(n-1)}.$$

This is the smallest possible discriminant for an order
inside any $K$-central division algebra of index $n$.

The following result is from [35], but is presented here
for the first time in an article.

Theorem 12.3 ([35]): Let $\mathcal{A}$ be a $K$-central division
algebra of index $2k = n$, where $k$ and 2 are relatively
prime, and let $P_1 < P_2$ be a pair of smallest primes in
$\mathcal{O}_K$.

If $K$ has at least two real primes, then there exists a
$K$-central division algebra of index $n$ having a maximal
order with the discriminant

$$(P_1 P_2)^{k(k-1)}.$$

If $K$ has only one real prime $P_\infty$, then there exists a
$K$-central division algebra of index $n$ having a maximal
order with the discriminant

$$P_1^{n(n-1)} P_2^{k(k-1)}.$$

This is the smallest possible discriminant of all orders
of index $n$ division algebras with center $K$.

We have now given completely general discriminant 
bounds for any center and for any index n. 

Proposition 12.4: Let $\mathcal{A}$ be a $K$-central division algebra
of index $n$, $2 \mid n$, where $K$ is a totally real number
field, and let $P_1 < P_2$ be a pair of smallest primes in $K$.
Let us suppose that all the infinite primes are ramified
in $\mathcal{A}$.

If $2 \,\|\, n$ (so that $n = 2k$ with $k$ odd) and $2 \mid [K : \mathbb{Q}]$,
then the minimal discriminant of $\mathcal{A}$ is

$$(P_1 P_2)^{k(k-1)}.$$

If $4 \mid n$, then the minimal discriminant of $\mathcal{A}$ is

$$(P_1 P_2)^{n(n-1)}.$$

If $2 \,\|\, n$ and $2 \nmid [K : \mathbb{Q}]$, then the minimal discriminant
of $\mathcal{A}$ is

$$P_1^{n(n-1)} P_2^{k(k-1)}.$$

Proof: In the proofs of Theorems 12.3 and 12.2 the
general strategy was to choose a set of H-invariants that
will yield an index $n$ division algebra (see Proposition 12.1)
and then prove that our choice was the best possible. We
will use the same strategy here, but the difference is that
we can do the optimization over division algebras that
are totally ramified at infinite primes.

The assumption of ramified infinite primes always
gives us $m$ non-trivial Hasse invariants $\{h_{P_1}, \ldots, h_{P_m}\}$,
where $h_{P_i} = 1/2$ and $P_i$ are all the infinite primes in $K$.

The Hasse invariants at infinite places do not contribute
anything to the discriminant of the division algebra.
If we have an index $n$ division algebra, the contribution
of a Hasse invariant $h_P = a/m_P$, where $m_P$ is the
local index at the finite prime $P$, to the $\mathcal{O}_K$-discriminant
is $P^{(m_P-1)\,n^2/m_P}$. Therefore in most cases we can simply
prove the minimality of the corresponding discriminant
by showing that, despite the extra ramification at infinite
primes, we can choose a set of Hasse invariants that will
give us an index $n$ division algebra with a discriminant
reaching the bound of Theorem 12.3 or 12.2.

In Table III we have collected the Hasse-invariants (at 
finite places) of the algebras we claim to be optimal. 



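TABLE III

index                $[K : \mathbb{Q}]$    H-invariants at finite places
$4k$                 odd                   $h_{P_1} = \frac{1}{4k}$,  $h_{P_2} = \frac{2k-1}{4k}$
$4k$                 even                  $h_{P_1} = \frac{1}{4k}$,  $h_{P_2} = \frac{4k-1}{4k}$
$2k$, $2 \nmid k$    even                  $h_{P_1} = \frac{1}{k}$,  $h_{P_2} = \frac{k-1}{k}$
$2k$, $2 \nmid k$    odd                   $h_{P_1} = \frac{k-2}{2k}$,  $h_{P_2} = \frac{1}{k}$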



In addition to what is said in the table about the
H-invariants at the finite places, we suppose that each of
these algebras has H-invariant $1/2$ at all the infinite
primes. By a direct calculation we can see that in each
case we get a division algebra of index $n$ with all the
infinite primes ramified. This will take care of the first
two claims of the proposition. In the first case, where
$2 \,\|\, n$ and $2 \mid [K : \mathbb{Q}]$, the division algebra given in the
table will reach the claimed bound, which coincides with
the general bound in Theorem 12.3. In the case $4 \mid n$ the
algebras given in Table III reach the bound of Theorem 12.2
and we are done with the second claim.

We are left with the case where $2 \,\|\, n$ and
$2 \nmid [K : \mathbb{Q}] = m$. In this case the problem is that while the sum
of the $m - 1$ first infinite Hasse invariants is an integer,
there is still one extra infinite H-invariant $h_{P_m} = 1/2$ we
have to take care of. Therefore we are forced to choose
Hasse invariants $h_{P_1} = \frac{k-2}{2k}$ and $h_{P_2} = \frac{1}{k}$ for the finite
places. The proof that this set of Hasse invariants will
give us the optimal discriminant is verbatim the same as
it is for the case where the center has exactly one real
place. This case was dealt with in the proof of Theorem
12.3 and we refer the reader to [35].
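
The "direct calculation" for this last case can be sketched numerically. The snippet below assumes the finite invariants $\frac{k-2}{2k}$ and $\frac{1}{k}$ together with invariant $1/2$ at each of the $m$ real primes, and checks that for odd $k \ge 3$ and odd $m$ the invariants sum to an integer and the least common multiple of the local indices is $n = 2k$. The helper name `check_case` is hypothetical.

```python
from fractions import Fraction
from math import gcd


def check_case(k: int, m: int):
    """Return (sum of invariants is an integer, LCM of the local indices)."""
    finite = [Fraction(k - 2, 2 * k), Fraction(1, k)]
    infinite = [Fraction(1, 2)] * m  # all m real primes ramified
    invariants = finite + infinite
    total = sum(invariants, Fraction(0))
    index = 1
    for h in invariants:
        # Fraction keeps lowest terms, so the denominator is the local index.
        index = index * h.denominator // gcd(index, h.denominator)
    return total.denominator == 1, index


for k in (3, 5, 7, 9):
    for m in (1, 3, 5):
        print(k, m, check_case(k, m))
```

With an even number of real primes the same finite invariants no longer sum to an integer, for instance `check_case(3, 2)` returns `(False, 6)`, which is why the even and odd cases need different choices.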



